ABSTRACT


It  has  been  argued  that  knowledge-based  systems  (KBS)  must  reason  from  evidential 
information  -  i.e.,  information  that  is  to  some  degree  uncertain,  imprecise,  and  occasionally 
inaccurate.  This  is  no  less  true  of  KBS  that  operate  in  the  domain  of  computer-based  image 
interpretation.  Recent  research  has  suggested  that  the  work  of  Dempster  and  Shafer  (DS) 
provides  a  viable  alternative  to  Bayesian-based  techniques  for  reasoning  from  evidential 
information.  In  this  paper,  we  discuss  some  of  the  differences  between  the  DS  theory  and 
some  popular  Bayesian-based  approaches  to  effecting  the  reasoning  task.  We  then  discuss 
some  work  on  integrating  the  DS  theory  into  a  knowledge-based  high-level  computer  vision 
system  in  order  to  examine  various  aspects  of  this  new  technology  that  have  not  been 
explored  to  date.  Results  from  a  large  number  of  image  interpretation  experiments  will 
be  presented.  These  results  suggest  that  a  KBS’s  performance  improves  substantially 
when  it  exploits  various  features  of  the  DS  theory  that  are  not  readily  available  in  pure 
Bayesian-based  approaches. 


Index Terms: uncertain reasoning, evidential reasoning, belief functions,
knowledge-based system, computer vision, image interpretation.

1  INTRODUCTION

It is widely accepted that knowledge-based systems (KBSs) that operate in complex
domains must “reason” from information that is to some degree uncertain, imprecise,
and occasionally inaccurate, called “evidential” information [25]. Furthermore, each body
of information is usually generically distinct and is typically obtained from a variety of
disparate sources, commonly called knowledge sources (KSs). The evidential information
that KSs provide is derived, in part, from imperfect perceptions of their environment and,
as such, can be viewed as partial evidence for or against the occurrence of semantically
meaningful events in some domain of interest. Given this reality, the degree to which a
KBS  successfully  deals  with  real  world  problems  depends,  in  part,  on  the  technology  it 
employs  to  reason  from  evidential  information. 

In  this  paper,  we  are  concerned  with  the  integration  and  evaluation  of  a  technology 
that  KBSs  might  use  to  complete  two  fundamental  tasks.  One  task  is  to  reason  from 
evidential  information  in  order  to  interpret  (i.e.,  understand)  the  perceptions  of  its  KSs. 
The  second  is  to  decide  how  to  allocate  its  limited  resources  in  order  to  successfully 
complete  the  previous  task.  That  is,  we  must  anticipate  that  the  complexity  of  the  real 
world  prohibits  a  KBS  from  understanding  its  perceptions  in  one  fell  swoop.  Rather, 
“control-related”  information  must  be  obtained  in  order  to  help  make  decisions  about  the 
type,  nature,  quality,  and  quantity  of  the  information  that  is  required  to  interpret  the 
perceptions  of  KSs.  In  the  work  reported  here,  the  control-related  information  that  a 
KBS must reason from is provided by control knowledge sources (CKSs). As with KSs,
the information that CKSs provide is derived, in part, from their perceptions of the state
of the system and/or the environment. As a consequence, such control-related information
is  also  evidential  in  nature.  Thus,  KBSs  will  be  more  successful  at  understanding  their 
perceptions to the degree they employ technologies that are better suited than current
techniques for reasoning from limited evidential information.

Recent research indicates that Dempster’s rule for combining beliefs and Shafer’s
theory of belief functions show promise as a more viable alternative than some popular
Bayesian-based techniques for addressing these problems [7], [34], [26], [14], [8], [27], [15],
[38], [1], [16], [17], [41], [42], [43], [44]. In addition, the work of Dempster and Shafer is
the basis of a developing concept called “evidential reasoning” (ER) [25]. This concept,
which  we  shall  discuss  later,  is  the  foundation  of  our  framework  for  addressing  the  above 
interpretation  and  control  problems  in  our  domain  of  interest. 

It  is  clear  that  the  problem  of  reasoning  from  evidential  information  is  common  to 
many  KBSs  that  operate  in  complex  domains.  However,  we  have  chosen  the  task  domain  of 
high-level  computer-based  image  interpretation  as  the  context  within  which  to  discuss  and 
present  some  of  our  work.  Within  this  context,  we  shall  discuss  research  on  the  application 
of  the  DS  and  ER  technologies  to  a  KBS  that  is  designed  to  interpret  two-dimensional 
monocular  color  images  of  outdoor  natural  scenes. 

We  begin  the  discussion  by  stating  a  major  objective  of  general  purpose  high-level 
knowledge-based  image  interpretation  systems.  Next  we  briefly  describe  the  “origins”  of 
the  evidential  information  that  an  image  interpretation  system  must  reason  from  in  order 
to complete its tasks. Then we describe some of the difficulties with using probabilistic-based
approaches for reasoning from evidential information. Following this discussion, we
shall introduce Shafer’s theory of belief functions and Dempster’s rule, and contrast them with
some aspects of probability theory. Next, we shall acquaint the reader with Lowrance and
Garvey’s  concept  of  evidential  reasoning  (ER)  [25].  And  after  introducing  this  concept, 
we  shall  describe  our  high-level  knowledge-based  image  interpretation  system  that  was 
built  to  employ  and  explore  some  aspects  of  both  the  Dempster-Shafer  (DS)  theory  and 
ER  technology.  Finally,  results  of  interpretation  experiments  will  be  presented  followed 
by  a  discussion  of  related  and  future  work  in  this  area. 

We note that our emphasis throughout this paper is on the underlying
technology a KBS might use to reason from evidential information. For examples and a
more extensive review of computer vision systems see, for instance, Nagao and Matsuyama
[28], Binford [3], Havens and Mackworth [20], Selfridge [33], Brooks [4], Sloan [37], Hanson
and Riseman [19], Levine [23], and Levine and Shaheen [24].

2  IMAGE  INTERPRETATION  OBJECTIVES 

An  example  of  a  typical  complex  outdoor  natural  scene  that  a  general  purpose 
knowledge-based  image  interpretation  system  might  be  expected  to  understand  is  shown 
in Figure 1. An objective of such systems is to identify semantically meaningful visual
entities in a digitized and segmented image of some scene; that is, to correctly assign
semantically meaningful labels (e.g., house, tree, grass, and so on) to regions in an image
(see [29], [30]). A computer-based image interpretation system can be viewed as having
two  major  components,  a  “low-level”  component  and  a  “high-level”  component  [19],  [31]. 
In  many  respects,  the  low-level  portion  of  the  system  is  designed  to  mimic  the  early  stages 
of  visual  image  processing  in  human-like  systems.  In  these  early  stages,  it  is  believed  that 
scenes  are  partitioned,  to  some  extent,  into  regions  that  are  homogeneous  with  respect  to 
some  set  of  perceivable  features  (i.e.,  feature  vector)  in  the  scene  [6],  [40],  [39].  To  this 
extent,  most  low-level  general  purpose  computer  vision  systems  are  designed  to  perform 
the same task. An example of a partitioning (i.e., segmentation) of Figure 1 into homogeneous
regions is shown in Figure 2. The knowledge-based computer vision system we shall
describe  in  this  paper  is  not  currently  concerned  with  resegmenting  portions  of  an  image. 
Rather,  its  task  is  to  correctly  label  as  many  regions  as  possible  in  a  given  segmentation. 

It is clear that no segmentation is perfect. There will be regions that overlap semantically
distinct visual entities, and there might be regions that are oversegmented,
i.e., multiple regions that partition a single semantic entity. These anomalies are due,
in part, to several unavoidable realities of the visual domain. First, imaging machinery will
simultaneously lose meaningful information and introduce bogus information, e.g., noise
and/or distortion. Thus, the data from which a segmentation must be produced is an
imperfect abstraction of the scene a system is expected to understand. Second, semantic
information about objects in a scene cannot be contained entirely in the image data. As
a consequence, some regions will partition a single visual entity, and still other regions
may enclose multiple semantically distinct visual entities.

In  the  KBS  we  shall  be  describing,  KSs  extract  a  variety  of  image  feature  information 
from  a  subset  of  regions  in  a  segmented  image  -  e.g.,  spectral,  texture,  shape,  and  spatial 
attributes of regions. Based on their perceptions, KSs form opinions about the presence
and/or absence of features they are capable of observing. What logically follows is that
beliefs  that  are  based  on  these  opinions  will  be  imperfect.  And  at  best  such  opinions  can 
be  viewed  as  only  evidence  to  suggest  the  presence  or  absence  of  semantically  meaningful 
entities  in  a  particular  scene  of  interest. 

Given  this  reality,  how  might  a  system  represent  the  evidential  information  it  obtains 
from  KSs?  And  how  might  a  KBS  reason  from  this  evidence  more  effectively  than  current 
approaches permit? Let us begin to answer these questions by briefly reviewing current
approaches.

2.1  CURRENT  APPROACHES  TO  REASONING 

Some  of  the  problems  with  reasoning  from  evidential  information  are  common  to  many 
domains  other  than  computer-based  image  interpretation.  Many  of  the  currently  popular 
approaches  for  addressing  them  are  probabilistic  in  nature.  That  is,  probabilities  are  used 
to represent belief in propositions, and Bayes’ rule or an ad hoc variant thereof is typically
used  to  update  a  system’s  belief  in  propositions  based  on  new  bodies  of  evidence.  See, 
for  instance,  the  work  on  systems  like  VISIONS  [19],  Prospector  [10],  MYCIN  [36],  and 
Gorry’s computer-aided medical diagnosis system [18]. The problems with probabilistic-based
approaches are well known and continue to be discussed in the literature [35], [26],
[21]. However, let us briefly state a few of the problems that motivated our exploration
into alternative theories.

One concern with probabilistic-based models is their voracious appetite for data.
Where $H$ and $e$ represent some hypothesis and body of evidence, respectively, in order
to use the inversion formula of Bayes’ rule we must have some prior belief $P(H)$, $P(e)$,
and likelihood $P(e \mid H)$. It is well known that the complexity of real world domains makes
it difficult, at best, to obtain or reliably estimate such beliefs and likelihoods. Some have
countered by saying that a complete probability specification is not required; rather,
one need only estimate the odds or likelihood-ratio [32]. Our concern with this view
is that a large number of likelihood-ratios still must be provided, and that the problems
of not having a uniform representation of ignorance and not being able to distinguish disbelief
from no belief also remain.

In a probabilistic approach, one typically represents ignorance in a set, say $\Theta$, of
mutually disjoint propositions by the following probability distribution $P_\Theta$:

$$P_\Theta(\theta) = \frac{1}{|\Theta|}, \quad \theta \in \Theta. \tag{1}$$


If it becomes necessary to change $\Theta$ to $\Theta'$, where $|\Theta| \neq |\Theta'|$, then our numerical
representation of ignorance must also change. Such a change might have been induced by
the acquisition of additional evidence. If this is the case, by what theory does one reconcile
or interpret the disparity in the representation of ignorance in $\Theta$ and $\Theta'$? It is important
that a system be capable of dealing with such disparities, particularly when it happens to be
equally ignorant about some $\theta \in \Theta$ and the same $\theta \in \Theta'$, but $P_\Theta(\theta) \neq P_{\Theta'}(\theta)$.
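The dependence of equation 1 on the size of the frame can be illustrated with a short sketch. The function name `uniform_ignorance` and the two label sets are illustrative assumptions and are not taken from any system described in this paper.

```python
from fractions import Fraction


def uniform_ignorance(frame):
    """Return the distribution of equation 1: P(theta) = 1/|frame| for each theta."""
    frame = set(frame)
    if not frame:
        raise ValueError("frame must contain at least one proposition")
    weight = Fraction(1, len(frame))
    return {theta: weight for theta in frame}


# Hypothetical frames: the second extends the first with one more label.
theta = {"house", "barn"}
theta_prime = {"house", "barn", "tree"}

p = uniform_ignorance(theta)
p_prime = uniform_ignorance(theta_prime)

# Same proposition, same state of ignorance, different numbers.
print(p["house"], p_prime["house"])  # 1/2 1/3
```

Nothing has been learned about "house" between the two calls, yet the number that is supposed to encode ignorance about it has changed.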


The additivity constraint

$$P(A) + P(\neg A) = 1 \tag{2}$$

imposes some undesirable restrictions on a system’s ability to distinguish disbelief from no
belief in the truthfulness or falseness of a proposition. If our belief in $A$ happens to be, for
instance, $P(A) = x$, then we are forced to believe to a degree of $1 - P(A) = P(\neg A) = 1 - x$
that $\neg A$ is true. It might be the case, however, that we have no evidence to indicate $\neg A$
is true or $A$ is false to any degree. Thus, we must adopt beliefs for which we have no
supporting evidence. As a consequence, we cannot distinguish, in a single formal
representation, disbelief from no belief.

To summarize our concerns, a purely probabilistic approach to reasoning in complex
domains is overly restrictive. It is difficult to specify a complete probability
distribution over the propositions of interest due to the enormous number of micro events
that must be taken into account. As a consequence, pure probabilistic approaches
are typically compromised by making ad hoc modifications to Bayes’ rule in order to deal
with these restrictions (see, for instance, [9], [10]). A formal and uniform representation of
ignorance remains unavailable, and we cannot distinguish disbelief from no belief. These
are a few of the unwarranted constraints that have motivated us to seek alternative
approaches to these problems. Our efforts have led us to investigate some of
the work of Arthur Dempster and Glenn Shafer, commonly called the Dempster-Shafer
(DS) theory [7], [34].


3  THE  DEMPSTER-SHAFER  THEORY 

We  can  view  the  problem  of  reasoning  in  complex  domains  as  one  of  trying  to  answer 
a  particular  question  of  interest.  For  instance,  in  the  computer  vision  domain  a  system 
might  be  asked  to  solve  the  problem  of  identifying  an  object  in  some  region  of  interest.  A 
typical question might be which of the following disjoint propositions, say $\theta_1$ and $\theta_2$, is
true: The region is a house ($\theta_1$) or The region is a barn ($\theta_2$)? Or in other words, which
label hypothesis, house or barn, should be associated with the current region of interest?
A  system  must  answer  this  question,  in  part,  by  obtaining  and  pooling  the  appropriate 
beliefs  that  would  allow  it  to  discern  a  house  from  a  barn.  Then  the  problem  is  solved  - 
i.e., the question is answered - to the extent a system is capable of successfully obtaining
and pooling such beliefs.

In Shafer’s theory, the degree of belief, $Bel$, that one should accord a proposition is
represented as a number between zero and one. Suppose $\Theta$ is a finite set, and we denote
the set of all subsets of $\Theta$ by $\mathcal{P}(\Theta)$. Then if $Bel$ satisfies the following conditions:

(1) $Bel(\emptyset) = 0$.

(2) $Bel(\Theta) = 1$.

(3) For every positive integer $n$ and every collection $A_1, \ldots, A_n$ of subsets of $\Theta$,

$$Bel(A_1 \cup \cdots \cup A_n) \ge \sum_{i} Bel(A_i) - \sum_{i<j} Bel(A_i \cap A_j) + \cdots + (-1)^{n+1} Bel(A_1 \cap \cdots \cap A_n), \tag{3}$$

then $Bel$ is a belief function over $\Theta$. Within the context of Shafer’s theory and with
respect to this simple example, $\Theta = \{\theta_1, \theta_2\}$ is called a frame of discernment. It
is clear that what constitutes $\Theta$ and how it is used is crucial to the success of problem
solving systems. Therefore, let us provide more background about a frame of discernment
before returning to our discussion of belief functions.

3.1  FRAME  OF  DISCERNMENT 

Suppose we are presented with a question and a finite set, $\Theta$, consisting of possible
answers to the question, only one of which is the correct one. Then for each $\theta \in \Theta$ the
propositions of interest are precisely those of the form “The correct answer is $\theta_1$”, “The
correct answer is $\theta_2$”, ..., “The correct answer is $\theta_n$”, for $n = |\Theta|$. Simply
stated, a set is called a frame of discernment when its elements are interpreted as possible
answers to a particular question, and we know that exactly one of these answers is correct.
What logically follows from this statement is that the set of all propositions of interest
is in a one-to-one correspondence with the set of subsets of $\Theta$, i.e., $\mathcal{P}(\Theta)$.

A frame of discernment is epistemic in nature. Its meaning and justification for
existing lie in the knowledge and evidence that are brought to bear in order to discern the
correct answer, i.e., which singleton proposition in $\Theta$ is true. One will be able to identify
the correct answer to a question only to the degree that a frame adequately captures the
relevant interaction of such knowledge.

Consider  the  large  amount  of  information  that  is  typically  required  to  answer  any 
particular question in the real world. For example, suppose a robot were trying to answer
the question of what object is currently in its field of view. A set of possible answers
to this question might be “A House”, “A Barn”, “A Tree”, and so on. Examples of
knowledge that might be brought to bear on this question are: the shape of houses, barns,
and trees; the texture of each object; the spectral attributes of houses, barns, and trees;
and  perhaps  the  spatial  relationships  between  these  objects.  Each  example  just  given  is  a 
generically  distinct  type  of  knowledge  and,  by  itself,  can  be  viewed  as  a  relatively  “small 
world”  of  knowledge  compared  to  the  total  amount  that  is  usually  needed  to  answer  this 
and  more  complex  questions. 

The  propositions  contained  in  each  distinct  body  of  knowledge  might  relate  quite 
differently to subsets of the possible answers. For instance, with respect to shape, if the
proposition The region contains many lines meeting at obtuse angles to one another were
true, then we would want to admit that it is possible “A House” or “A Barn” is the
correct answer and that “A Tree” is not. Similarly, with respect to texture, houses and
barns might be relatively less textured than trees. Then if the proposition The region is
relatively smoothly textured is true, we would want to further admit that “A House”
or “A Barn” is possibly a correct answer and “A Tree” is not.

In this example, each proposition in each distinct small world can be viewed as a
“feature proposition” (e.g., “The region is relatively smoothly textured”) that might help
to discern which answer is possibly correct. The set of all feature propositions in a
small world constitutes a “feature space” (e.g., the texture feature space). Each feature
space can be thought of as containing at least one proposition that is associated with
some observable and quantifiable aspect, called a “feature value”, of the related chunk of
knowledge (e.g., the average number of obtuse angles formed by straight lines in a region).
The set of all feature spaces of potential interest constitutes an “environment.” In
general, this includes any aspect of a domain or world about which information may be
obtained in order to help decide which answer is correct. With this partial background
we can present a more formal description of a frame of discernment.

3.1.1  A formal view. Let the set of mutually exclusive and exhaustive possible
interpretations of an image be represented by the set $\Theta_Q$, where

$$\Theta_Q = \{\theta_1, \theta_2, \ldots, \theta_n\}. \tag{4}$$

We can associate with each $\theta_i$, $1 \le i \le n$, a proposition that represents an interpretation,
e.g., $\theta_1$ might be associated with the proposition The image is a house scene, $\theta_2$ might
be associated with the proposition The image is a tree scene, or $\theta_3$ might be associated
with the proposition The image is a house and tree scene, and so on.

Let $F_1, F_2, \ldots, F_m$ correspond to the feature spaces of interest, e.g., $F_1$ might
correspond to the spectral features of objects, $F_2$ might correspond to the texture features
of objects, and so on. Associated with each $F_i$, for $1 \le i \le m$, is a set $\mathcal{F}_i$ of possible
feature values of $F_i$,

$$\mathcal{F}_i = \{ f_i^k \mid f_i^k \text{ is a possible feature value of } F_i, \text{ for } 1 \le k \le |\mathcal{F}_i| \}. \tag{5}$$

For example, if $\mathcal{F}_i$ is the set of feature values for the texture feature space $F_i$, then
$f_i^k$ may correspond to a relatively smooth texture value, as might be characteristic of
objects such as sky or paved road. Similarly, $f_i^{k'}$ may correspond to a relatively rough
texture value, as might be characteristic of objects such as tree crowns or grass. Like each
$\theta_i$, we can also associate with every $f_i^k$ a proposition that describes a possible outcome
as a result of performing a perceptual operation, e.g., $f_i^{k'}$ might be associated with the
proposition The image contains relatively rough textured regions, $f_i^k$ might be associated
with the proposition The image contains relatively smooth textured regions, or $f_i^{k''}$ might
be associated with the proposition The image contains both relatively rough and smooth
textured regions, and so on.

For each $f_i^k \in \mathcal{F}_i$ it is possible to identify a subset of $\Theta_Q$ that possibly contains the
correct interpretation when $f_i^k$ is observed. For instance, let $\Theta_Q$ be defined as follows:

$$\Theta_Q = \{\text{tree crown}, \text{sky}, \text{grass}, \text{paved road}\}. \tag{6}$$

Let $f_i^k$ and $f_i^{k'}$ correspond to the texture feature values just discussed. If observations by
a texture KS indicate that the proposition The image contains relatively smooth textured
regions (i.e., $f_i^k$) is true, then it is possible that the region of interest should be labeled sky
or paved road and should not be labeled tree crown or grass. Conversely, if observations
by a texture KS indicate that the proposition The image contains relatively rough textured
regions (i.e., $f_i^{k'}$) is true, then it is possible the measured region should be labeled tree
crown or grass and should not be labeled sky or paved road.

Given that we can identify a subset of $\Theta_Q$ that is possibly the correct answer for a
question of interest when a feature $f_i^k \in \mathcal{F}_i$ is observed, the set $\Theta_Q$ can be generated
by a characteristic set function that is defined over the space of feature propositions of
interest, i.e., the $f_i^k$'s in the $\mathcal{F}_i$'s. That is, we can define a distinct function $\chi_i$ over each $\mathcal{F}_i$
to be:

$$\chi_i : \mathcal{F}_i \to \mathcal{P}(\Theta_Q), \tag{7}$$

such that

$$\bigcup_{f_i^k \in \mathcal{F}_i} \chi_i(f_i^k) = \Theta_Q. \tag{8}$$

$\chi_i$ is called the characteristic set function of $\mathcal{F}_i$, and $\chi_i(f_i^k)$ is called the characteristic
set of $f_i^k$. An example characteristic function for our example above might be:

1) $\chi$(The region is relatively smoothly textured) = {sky, paved road};

2) $\chi$(The region is relatively rough) = {tree crown, grass}.

A frame of discernment, then, can be defined in terms of a set $\Theta_Q$ and a collection of
feature spaces and their characteristic set functions.

A frame is said to be internally complete if every element of $\Theta_Q$ can be realized as
an intersection of characteristic sets. Note that if perfect information is available (i.e.,
every feature proposition is either true or false) and $\Theta_Q$ captures the relevant interaction
of our knowledge and available evidence, then drawing inferences over $\Theta_Q$ amounts to
computing set intersections. For instance, with respect to the question of which proposition,
$\theta \in \Theta_Q$, is true:

1) if $f_1^k$ is true then the correct answer (i.e., the proposition $\theta \in \Theta_Q$ that is possibly
true) lies in $\chi_1(f_1^k)$;

2) if $f_2^{k'}$ is true then the correct answer lies in $\chi_2(f_2^{k'})$.

As a consequence, the combined correct answer is contained in

$$\chi_1(f_1^k) \cap \chi_2(f_2^{k'}). \tag{9}$$
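The inference-by-intersection idea of equations 7 through 9 can be sketched in a few lines. The texture entries below follow the example in the text. The spectral feature space, its values ("blue", "green", "gray"), and the names `CHI`, `covers_frame`, and `candidates` are hypothetical and are introduced only to give the intersection something to work on.

```python
from functools import reduce

# Frame of equation 6.
THETA_Q = frozenset({"tree crown", "sky", "grass", "paved road"})

# One characteristic set function per feature space (equation 7).
# Texture follows the text; the spectral entries are assumed for illustration.
CHI = {
    "texture": {
        "smooth": frozenset({"sky", "paved road"}),
        "rough": frozenset({"tree crown", "grass"}),
    },
    "spectral": {
        "blue": frozenset({"sky"}),
        "green": frozenset({"tree crown", "grass"}),
        "gray": frozenset({"paved road", "sky"}),
    },
}


def covers_frame(chi_i, frame=THETA_Q):
    """Check equation 8: the characteristic sets of one feature space cover the frame."""
    return frozenset().union(*chi_i.values()) == frame


def candidates(observed, chi=CHI, frame=THETA_Q):
    """Intersect the characteristic sets of the observed feature values (equation 9)."""
    sets = [chi[space][value] for space, value in observed.items()]
    return reduce(frozenset.intersection, sets, frame)


print(sorted(candidates({"texture": "smooth", "spectral": "gray"})))
# ['paved road', 'sky']
```

Under these assumed tables, observing a smooth texture together with a blue spectral value would narrow the answer to sky alone, whereas a rough texture with a blue value yields the empty set, signalling that the two observations cannot both be correct.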


3.2  CONVEYING  OPINIONS 

In this scheme a KS might convey its opinion about the degree to which it believes
feature propositions are true or false through the following “mass function” $M$:

$$M : \mathcal{P}(\Theta_Q) \to [0,1], \quad \text{where } M(\emptyset) = 0, \tag{10}$$

and

$$\sum_{A \subseteq \Theta_Q} M(A) = 1. \tag{11}$$

We note here that a Bayesian probability distribution $m$:

$$m : \Theta_Q \to [0,1], \quad \text{where } \sum_{\theta \in \Theta_Q} m(\theta) = 1, \tag{12}$$

is just a special case of a mass function, and that the beliefs induced by both $M$ and $m$
satisfy the conditions of equation 3 and are belief functions over $\Theta_Q$. The implication of
this statement is that if a complete probability specification is available, then the DS theory
is capable of integrating this information with other bodies of evidence.
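As a minimal sketch of equations 10 through 12, a mass function can be held as a mapping from subsets of the frame to masses. The dictionary-of-`frozenset` representation, the function names, and the two example opinions below are assumptions made for illustration, with exact fractions used to avoid rounding.

```python
from fractions import Fraction

THETA_Q = frozenset({"tree crown", "sky", "grass", "paved road"})


def is_mass_function(m, frame=THETA_Q):
    """Check equations 10 and 11 for a dict mapping frozenset -> mass."""
    if any(not subset <= frame for subset in m):
        return False                      # every focal element must lie in the frame
    if any(not 0 <= mass <= 1 for mass in m.values()):
        return False
    if m.get(frozenset(), 0) != 0:
        return False                      # M(empty set) = 0
    return sum(m.values()) == 1           # the unit mass is fully distributed


def is_bayesian(m):
    """Equation 12: a Bayesian distribution commits mass to singletons only."""
    return all(len(subset) == 1 for subset, mass in m.items() if mass > 0)


# Hypothetical opinion of a texture KS: 3/5 on the smooth objects, 2/5 uncommitted.
texture_opinion = {
    frozenset({"sky", "paved road"}): Fraction(3, 5),
    THETA_Q: Fraction(2, 5),
}

# Hypothetical complete probability specification over the same frame.
bayesian_opinion = {
    frozenset({"sky"}): Fraction(1, 2),
    frozenset({"paved road"}): Fraction(3, 10),
    frozenset({"tree crown"}): Fraction(1, 10),
    frozenset({"grass"}): Fraction(1, 10),
}

print(is_mass_function(texture_opinion), is_bayesian(texture_opinion))    # True False
print(is_mass_function(bayesian_opinion), is_bayesian(bayesian_opinion))  # True True
```

Both opinions pass the same check, which is the sense in which a probability distribution is a special case: it is a mass function whose focal elements all happen to be singletons.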

Each body of evidence induces an interval, called an “evidential interval”, within
which belief about a proposition must lie. An evidential interval is a subinterval of the
real interval [0,1]. The lower and upper bounds of the evidential interval shall be called
the support ($Spt$) and plausibility ($Pls$), respectively. The $Spt$ represents the total
mass that tends to support a proposition:

$$Spt(B) = \sum_{p \subseteq B} M(p). \tag{13}$$

The $Pls$ represents the degree to which the mass fails to refute the proposition:

$$Pls(B) = 1 - Spt(\neg B) = 1 - \sum_{p \subseteq \neg B} M(p). \tag{14}$$

The degree to which the mass refutes a proposition is called the dubiety ($Dbt$):

$$Dbt(B) = Spt(\neg B) = 1 - Pls(B). \tag{15}$$

The degree to which no mass tends to support a proposition or its negation is called
ignorance ($Igr$):

$$Igr(B) = Pls(B) - Spt(B). \tag{16}$$
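Equations 13 through 16 translate almost directly into code. The sketch below assumes a mass function stored as a dictionary from `frozenset` to mass and reuses a hypothetical texture opinion; the function names simply mirror the symbols in the text.

```python
from fractions import Fraction

THETA_Q = frozenset({"tree crown", "sky", "grass", "paved road"})


def spt(m, b):
    """Support (equation 13): total mass of the non-empty subsets of b."""
    return sum(mass for p, mass in m.items() if p and p <= b)


def pls(m, b, frame=THETA_Q):
    """Plausibility (equation 14): one minus the support of the negation of b."""
    return 1 - spt(m, frame - b)


def dbt(m, b, frame=THETA_Q):
    """Dubiety (equation 15): support of the negation of b."""
    return spt(m, frame - b)


def igr(m, b, frame=THETA_Q):
    """Ignorance (equation 16): width of the evidential interval."""
    return pls(m, b, frame) - spt(m, b)


# Hypothetical opinion: 3/5 of the mass on the smooth objects, 2/5 uncommitted.
texture_opinion = {
    frozenset({"sky", "paved road"}): Fraction(3, 5),
    THETA_Q: Fraction(2, 5),
}

smooth = frozenset({"sky", "paved road"})
rough = frozenset({"tree crown", "grass"})

print(spt(texture_opinion, smooth), pls(texture_opinion, smooth))  # 3/5 1
print(spt(texture_opinion, rough), pls(texture_opinion, rough))    # 0 2/5
```

With this opinion the smooth labels receive the interval [3/5, 1] and the rough labels the interval [0, 2/5]; in both cases the ignorance is 2/5, the mass left on the whole frame.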

The  interpretations  of  some  evidential  intervals  are  summarized  below: 

Completely true proposition  [1, 1];

Completely false proposition  [0, 0];

Completely ignorant about the proposition  [0, 1];

Tends to support the proposition  [$Spt$, 1], 0 < $Spt$ < 1;

Tends to refute the proposition  [0, $Pls$], 0 < $Pls$ < 1;

Tends to both support and refute the proposition  [$Spt$, $Pls$], 0 < $Spt$ < $Pls$ < 1.


Note  that  with  a  mass  function  a  KS  is  able  to  express  its  beliefs  at  any  desired 
precision  or  certainty  -  i.e.,  a  source  can  express  beliefs  by  attributing  any  amount  of  mass 
to  any  proposition  it  desires.  In  addition,  an  evidential  interval  allows  one  to  distinguish 
disbelief  from  no  belief,  unlike  a  pure  point  probabilistic  representation.  And  finally,  an 
evidential  interval  provides  a  single  formal  and  uniform  representation  of  ignorance  -  i.e., 
the interval [0, 1] always represents total ignorance across model variations.
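The claim that [0, 1] represents total ignorance regardless of the frame can be checked with a small sketch. It assumes total ignorance is expressed by leaving the entire unit mass on the frame itself; the helper names and the two frames are illustrative only.

```python
def evidential_interval(m, b, frame):
    """Return (Spt, Pls) of proposition b under the mass function m."""
    b = frozenset(b)
    support = sum(mass for p, mass in m.items() if p and p <= b)
    dubiety = sum(mass for p, mass in m.items() if p and p <= frame - b)
    return support, 1 - dubiety


def vacuous(frame):
    """Total ignorance: the whole unit mass stays on the frame itself."""
    return {frozenset(frame): 1}


# Hypothetical frames of different sizes.
small = frozenset({"house", "barn"})
large = frozenset({"house", "barn", "tree"})

for frame in (small, large):
    print(evidential_interval(vacuous(frame), {"house"}, frame))  # (0, 1) both times
```

Unlike the uniform distribution of equation 1, the representation of ignorance about "house" does not change when the frame grows from two labels to three.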
3.3  COMBINING  BELIEF  (MASS)  FUNCTIONS

Given  the  complexity  of  the  real  world,  it  is  unlikely  that  a  single  source  of  information 
will  be  capable  of  providing  independent  beliefs  (i.e.,  opinions)  about  the  truthfulness  or 
falseness  of  feature  propositions.  Rather,  a  more  pragmatic  approach  is  to  have  multiple 
distinct  KSs  express  opinions  about  their  perceptions  by  attributing  a  portion  of  their  unit 
mass  to  feature  propositions.  In  this  approach,  we  need  to  be  able  to  form  a  consensus 
opinion  by  combining  multiple  mass  functions.  In  the  theory  of  belief  functions,  the  tool 
for  carrying  out  this  pooling  process  is  Dempster’s  rule  [7]. 

3.3.1  Dempster’s rule. Dempster’s rule tells us how to take two mass distributions
$M_1$, $M_2$ and produce a third mass distribution $M_3$ that represents a consensus opinion
of two distinct sources, and is defined to be:

For all $B_1, B_2, B_3 \subseteq \Theta_Q$ with $B_3 \neq \emptyset$,

$$M_3(B_3) = (1 - K)^{-1} \sum_{B_1 \cap B_2 = B_3} M_1(B_1) M_2(B_2), \tag{17}$$

for

$$K = \sum_{B_1 \cap B_2 = \emptyset} M_1(B_1) M_2(B_2) < 1,$$

where $K$ is the total amount of conflict between $M_1$ and $M_2$, and $(1 - K)^{-1}$ is a
renormalization factor.
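A minimal sketch of equation 17 follows, assuming mass functions are stored as dictionaries from `frozenset` to mass. The function name `combine` and both example opinions are hypothetical; exact fractions are used so that the renormalized masses can be read off directly.

```python
from fractions import Fraction
from itertools import product

THETA_Q = frozenset({"tree crown", "sky", "grass", "paved road"})


def combine(m1, m2):
    """Dempster's rule (equation 17). Returns the pooled mass function and K."""
    pooled = {}
    conflict = Fraction(0)                 # K: mass that falls on the empty set
    for (b1, w1), (b2, w2) in product(m1.items(), m2.items()):
        b3 = b1 & b2
        if b3:
            pooled[b3] = pooled.get(b3, 0) + w1 * w2
        else:
            conflict += w1 * w2
    if conflict >= 1:
        raise ValueError("totally conflicting opinions (K = 1) cannot be combined")
    # Renormalize by (1 - K) so that the pooled masses again sum to one.
    return {b3: w / (1 - conflict) for b3, w in pooled.items()}, conflict


# Hypothetical opinions of a texture KS and a spectral KS.
texture_opinion = {
    frozenset({"sky", "paved road"}): Fraction(3, 5),
    THETA_Q: Fraction(2, 5),
}
spectral_opinion = {
    frozenset({"sky"}): Fraction(1, 2),
    frozenset({"tree crown", "grass"}): Fraction(1, 4),
    THETA_Q: Fraction(1, 4),
}

pooled, conflict = combine(texture_opinion, spectral_opinion)
print(conflict)                      # 3/20
print(pooled[frozenset({"sky"})])    # 10/17
```

In this example the only conflicting pair is the smooth-texture mass against the spectral mass on tree crown and grass, giving K = 3/20. After renormalization, sky holds 10/17 of the pooled mass, the smooth set 3/17, the rough set 2/17, and 2/17 remains uncommitted on the frame.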

Using  Dempster’s  rule  to  combine  mass  distributions  accomplishes  three  functions. 
The  first  is  to  obtain  a  consensus  about  what  answer  each  source  believes  is  possibly 
correct.  If  both  opinions  are  completely  consistent,  there  is  at  least  one  answer  that  both 
sources  agree  is  correct,  and  it  can  be  said  they  are  expressing  totally  compatible  opinions 
- i.e., $K = 0$. Conversely, if beliefs are not completely consistent, then their opinions are
not totally compatible and there is at least one answer the sources disagree is appropriate,
i.e., $0 < K < 1$. In general, to the degree that sources are certain, precise, and accurate
with respect to their observations, their opinions about which answers are correct will
be compatible. Dempster’s rule determines simultaneously if there is any $\theta \in \Theta_Q$ that
multiple sources agree is true and provides a measure of compatibility among the opinions
they provide.

The  second  function  of  Dempster’s  rule  is  to  correct  for  minor  errors.  The  assumption 
here  is  that  there  is  only  a  negligible  likelihood  that  distinct  sources  might  introduce  the 
same  type  of  error  into  their  opinions  simultaneously  -  i.e.,  that  they  are  stochastically 
independent.  Therefore,  such  errors  can  be  overcome  by  a  sufficient  amount  of  redundant 
and  generally  correct  beliefs.  If  a  subset  of  sources  make  gross  errors,  such  bad  information 
should  be  discounted,  when  detected. 

The  third  function  of  Dempster’s  rule  is  to  compute  the  minimum  degree  of  support 
that  should  be  attributed  to  compatible  opinions,  if  such  an  opinion  exists.  In  a  sense, 
the  multiplicative  nature  of  equation  17  computes  the  minimum  commitment  of  support 
one  should  attribute  to  compatible  opinions  that  were  provided  from  independent  sources. 
But the requirement that sources be independent is crucial to the applicability of the rule
and  is  discussed  in  the  following  section. 

3.3.2  Independence.  Consider  the  following  excerpt  from  the  paper  that  describes 
the  independence  requirements  of  Dempster’s  rule  [7]. 


"...  Opinions  of  different  people  based  on  overlapping  experience  could  not  be  regarded  as 
independent  sources.  Different  measurements  by  different  observers  on  different  equipment 
would  often  be  regarded  as  independent,  but  so  would  different  measurements  by  one 
observer  on  one  piece  of  equipment:  here  the  question  concerns  independence  of  errors.” 


Our  reason  for  presenting  this  excerpt  is  to  emphasize  that  the  independence  constraint 
that  must  be  satisfied  before  Dempster’s  rule  is  potentially  applicable  is  with  respect  to 
the errors multiple sources might make. And despite the fact that this point has been
made in the mathematical literature, it has escaped recognition by a significant portion
of the artificial intelligence community that relies, to some degree, on the DS calculus for
reasoning  from  evidential  information. 

This  notion  of  independence  is  quite  different  from  that  which  typically  comes  to  mind 
during  discussions  of  classical  probabilistic  models  for  pooling  information.  The  classical 
definition of stochastic independence for $n$ events $E_1, \ldots, E_n$ is:

$$P\Big(\bigcap_{j=1}^{n} E_j\Big) = \prod_{j=1}^{n} P(E_j). \qquad (18)$$

But we must be careful when we try to interpret this equation in the context of “... here
the question concerns independence of errors” [7]. This is so because both the Bayesian
theory  and  belief  function  theory  treat  chance  in  different  ways,  and  as  a  consequence 
their  concepts  of  independence  are  slightly  different. 

Consider two propositions $A, B \subseteq \Theta_q$ that, for the moment, happen to be false with
respect to a particular frame and bodies of evidence. Now let $E_k$ be the event that $KS_k$
attributes a nonzero amount of mass in support of $A$. And let $E_i$ be the event that
$KS_i$ attributes a nonzero amount of mass in support of $B$, where $A \cap B \neq \emptyset$ and
$1 \le i \neq k \le n$. That is, for a particular frame both KSs have simultaneously erred
in their assessment of some body of evidence. Then $P(E_k \cap E_i)$ is interpreted as the
probability or chance that both $KS_k$ and $KS_i$ will simultaneously express opinions that
are compatible and erroneous.

If  a  frame  of  discernment  has  taken  into  account  all  significant  dependencies  then  the 
left  hand  side  of  equation  18  will,  as  a  consequence,  be  zero.  Also  notice  that  this  does 
not  mean  that  all  sources  must  be  error  free  in  order  for  this  equality  relation  to  hold.  It 
is only necessary that at least one $P(E_j) = 0$.

Under less than ideal conditions we need to augment the meaning of an event $E_j$
slightly. The reason is that noise is random in nature, and as a consequence we must
anticipate that equation 18 will not always be zero. Thus a more realistic concern is with
the probability or chance that distinct sources will simultaneously introduce errors above some
noise threshold, say $t$, into their opinions. Now we can interpret an event $E_k$ to mean
that $KS_k$ will attribute an amount of mass (i.e., support) $m_1(A) > v_k$, given that a
source $KS_i$ will attribute an amount of mass $m_2(B) < v_i$, where $0 < t < v_i, v_k < 1$ and
$A \cap B \neq \emptyset$. Then $P(E_k)$ can be interpreted as the a priori probability or chance that $KS_k$
will introduce errors larger than $v_k$ into its assessment of some body of evidence. That
is, we are only concerned with the chance that independent sources will simultaneously
introduce errors above some “noise” level in their opinions. We have shown how one can
determine the maximum amount, $v_k$ for instance, of mass that $KS_k$ can attribute to a
false proposition, say $A$, and keep the total amount of $Spt(A \cap B)$ below some level $s$,
given that $KS_i$ attributes an amount of mass in support of, say $B$, below $v_i$ [44].

With  respect  to  both  the  Bayesian  and  DS  technologies,  as  well  as  many  others,  one 
tries  to  make  the  relevant  dependencies  of  the  current  problem  effectively  independent 
within  the  context  and  constraints  of  each  theory.  Accomplishing  this  makes  the  machinery 
of  each  model  potentially  appropriate  to  use.  With  this  background  we  are  now  ready  to 
acquaint  the  reader  with  the  concept  of  evidential  reasoning. 

4  EVIDENTIAL  REASONING 

The  concept  of  evidential  reasoning  (ER)  was  introduced  by  Lowrance  and  Garvey 
[25].  This  evolving  technology  starts  from  the  position  that  the  acquisition  of  information 
by  KBSs  involves  making  imperfect  perceptions  of  the  environment.  A  KBS  “understands” 
its  world  by  perceiving  it  through  a  set  of  KSs.  And  because  a  system’s  perceptual 
machinery  is  not  flawless,  it  follows  that  the  information  the  KSs  provide  will  be  to  some 
degree  uncertain,  imprecise,  and  occasionally  inaccurate  -  evidential  in  nature.  This 
concept  currently  relies  on  the  DS  formalism  as  its  model  for  representing  and  pooling 
KSs’  beliefs  that  are  based  on  their  environmental  perceptions.  Thus  the  DS  formalism  is 
fundamental  to  ER-based  models  that  KBSs  might  use  to  reason  in  their  task  domain. 

There are two distinct reasoning processes that must be completed in this concept.
One  is  to  take  a  single  body  of  evidence  and  propagate  its  effect  from  those  propositions 
the  evidence  bears  directly  upon  to  those  it  indirectly  bears  upon.  This  allows  inferences 
to  be  drawn  about  those  propositions  not  directly  affected  by  the  evidence.  This  process  is 
typically  carried  out  by  what  is  commonly  called  an  inference  engine.  The  other  process, 
one  that  pools  multiple  bodies  of  evidence  into  a  single  body  of  evidence  that  represents 
a  consensus  opinion,  is  Dempster’s  rule  which  we  have  already  described. 

We can summarize these processes in terms of a KBS’s two computational requirements,
which for $B, C \subseteq \Theta_q$ are:

1.) Combination of multiple $M$’s:

a) Apply Dempster’s rule to $M_1$ and $M_2$ to produce a consensus opinion that
is reflected in $M_3 = M_1 \oplus M_2$.

b) If $B \cap C \neq \emptyset$, then add $M_1(B) \cdot M_2(C)$ to current $M_3(B \cap C)$.

c) If $B \cap C = \emptyset$, then add $M_1(B) \cdot M_2(C)$ to current $K$.

2.) Extrapolation: Taking the result of Dempster’s rule (i.e., $M_3$) and computing
the Spt and Pls of the remaining dependent propositions.

a) If $B \subseteq C$ then add $M(B)$ to current $Spt(C)$.

b) If $B \subseteq \neg C$ then add $M(B)$ to current $1 - Pls(C)$.
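
The extrapolation step (2a and 2b above) can be sketched as follows. This is a minimal illustration, again assuming a mass distribution stored as a dictionary from `frozenset` to mass; the function names `spt`, `pls`, and `dbt` and the example frame are hypothetical.

```python
def spt(m, c):
    """Support for C: total mass of focal elements B with B a subset of C (step 2a)."""
    return sum(w for b, w in m.items() if b <= c)


def dbt(m, c, frame):
    """Mass committed against C: focal elements B contained in the complement of C (step 2b)."""
    return sum(w for b, w in m.items() if b <= (frame - c))


def pls(m, c, frame):
    """Plausibility of C, obtained from step 2b as 1 minus the mass against C."""
    return 1.0 - dbt(m, c, frame)


# Hypothetical pooled mass distribution over a three-label frame.
frame = frozenset({"house", "road", "sky"})
m = {
    frozenset({"house"}): 0.5,
    frozenset({"house", "road"}): 0.3,
    frame: 0.2,
}

road = frozenset({"road"})
interval_for_road = (spt(m, road), pls(m, road, frame))
```

For `road`, no focal element lies inside `{road}`, so the support is 0, while the mass on `{house}` counts against it, giving the evidential interval [0, 0.5].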

With  this  fairly  extensive  discussion  of  Dempster’s  rule,  Shafer’s  theory,  and  the  concept 
of  evidential  reasoning,  we  can  now  introduce  our  evidential-based  high-level  computer 
vision system (EHCVIS). We then show how we have used both the DS and
ER technologies to reason in our task domain.

5 EHCVIS

A flow diagram of EHCVIS and a generalized illustration of its architecture are shown
in  Figure  3.  We  do  not  claim  that  the  architecture  of  our  system  is  ideal  for  a  high-level 
computer  vision  system  -  see,  for  instance,  Levine  for  a  discussion  of  a  general  design  for 
computer  vision  systems  [24] .  Rather,  the  design  we  have  chosen  is  one  of  perhaps  many 
that  might  be  adequate  for  interpreting  images  and  exploring  various  aspects  of  the  DS 
theory. 

As  indicated  by  the  figure,  EHCVIS  can  be  described  in  four  phases.  The  task  of  the 
first  phase  is  to  use  the  specifications  of  goals  to  help  complete  two  subtasks.  Examples 
of  goals  the  system  might  try  to  reach  are  finding  a  house,  locating  the  ground  plane,  or 
obtaining  additional  information  to  help  resolve  some  ambiguity  the  system  might  have 
about  the  identity  of  objects  in  a  region  of  interest.  The  first  subtask  is  to  use  goal 
specifications  to  generate  a  set  of  alternative  actions  the  system  might  pursue  in  order  to 
reach  that  goal.  The  second  subtask  is  to  use  goal  specifications  to  select  control  strategies 
that  will  be  used  to  help  decide  which  alternative  action  is  more  appropriate  to  pursue. 

The  second  phase  can  be  summarized  in  several  steps:  (1)  with  the  alternative  actions 
and  control  strategies  that  were  selected  in  the  previous  phase,  dynamically  build  the 
control knowledge (i.e., $\Theta_A$) that will be brought to bear on the problem of deciding which
alternative  to  pursue;  (2)  implement  these  control  strategies,  in  part,  by  obtaining  control 
related  information  from  independent  control  knowledge  sources  (CKSs);  and  (3)  pool 
these  beliefs  using  Dempster's  rule  and  then  use  an  inference  engine  to  take  the  result  of 
Dempster’s  rule  and  infer  which  action  is  the  best  to  pursue.  Note  that  in  the  third  step 
of  the  second  phase  (see  Figure  3)  beliefs  are  pooled  and  inferences  are  drawn  over  the 
frame denoted by $\Theta_A$. This is to indicate that the DS technology is used by our system
to reason about its actions.

In the third phase, our system takes the action suggested by the second phase. A
typical action might be to task a subset of available KSs to make some observation about a
particular subset of regions in an image, then express some beliefs about their perceptions.
After the KSs have done this, Dempster’s rule is used to pool their beliefs and then
inferences are drawn over $\Theta_q$ to infer which propositions (i.e., label hypotheses) in $\Theta_q$
should be associated with the region under examination.

In the last phase, the results of the inferences drawn over $\Theta_q$ are evaluated. Based
on this evaluation the system might decide that a new goal should be satisfied and return
to the goal generation phase; that the interpretation process should be terminated;
or that it should “instantiate” (i.e., record in a dynamic representation called
short term memory, STM) its belief that a subset of the label hypotheses in $\Theta_q$ should be
associated with a subset of the regions in an image, and then set new goals to be satisfied.
Let us briefly discuss each phase in more detail.

5.1 PHASE ONE:

5.1.1 Goals. EHCVIS begins the interpretation process when a goal is placed
on a goal stack. Every goal contains three parts: (1) a symbol that indicates a goal-name
or goal-identification; (2) a specification of a set of KS selection constraints; and
(3) a specification of a set of region selection constraints. For example, verify-kss and
reduce-ignorance-about-a-hypothesis are symbols that indicate, respectively, the goal of
verifying the perception of KSs and the goal of reducing the system’s ignorance about the
truthfulness or falseness of a label hypothesis. The KS selection constraints
specify attributes of KSs that must be satisfied before the system will consider them as
potential sources of information. Similarly, the region selection constraints specify
characteristics that regions must possess before the system will consider obtaining information
about them. For instance, when the system begins the interpretation process it is totally
ignorant about the identity of objects that are depicted in an image. One “start-up”
goal might be to obtain preliminary information about a subset of the regions in the
image. It might be desirable to satisfy this goal by tasking the most reliable KSs to obtain
information about “unusual” regions - i.e., regions exhibiting features that might cause,
say, a human to foveate upon them on initial examination of an image. An example of how such
a goal and its selection constraints might be specified is shown in Figure 4. The symbol
start-up in Figure 4 indicates that the system should try to reach the goal of obtaining
preliminary information about some region in an image. The symbol rel in the list
(rel 0.7 1) of the KS selection constraint portion of the start-up goal indicates that
this constraint pertains to the reliability of KSs. And the interval (... 0.7 1) indicates
the range within which the reliability of a KS must lie before it is considered a potential
source of information. How the reliability of KSs might be determined is not of interest
at the moment and is discussed in more detail in [44].

Upon  initiation  of  the  interpretation  task  it  might  be  desirable  to  focus  attention  on 
unusual  regions  -  e.g.,  regions  that  are  relatively  large,  relatively  bright  or  dark,  or  at 
some  extreme  location  in  an  image.  Suppose  our  system  is  initially  interested  in  relatively 
large  regions  at  a  relatively  high  location  in  an  image.  The  following  region  selection 
constraint  might  be  used  to  specify  this  interest: 

(conj  ((loc-above  x  y) 

(size  min-size  max-size)  ...)). 

Where (conj ((loc-above ...) (size ...) ...)) means a region becomes a potential
candidate for examination if its location is above some minimum x y position in the
image and its size is within the range (... min-size max-size). The shaded region
of  the  image  in  Figure  4  exemplifies  the  result  of  applying  a  similar  constraint  to  an  image 
that  has  been  interpreted  by  our  system. 
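
To make the two kinds of selection constraints concrete, here is a small Python sketch that treats each constraint as a predicate. Everything in it is an assumption made for illustration: the `KS` and `Region` records, their fields, the convention that a larger `y` is higher in the image, and the fact that `loc_above` tests only the vertical coordinate. It is not the representation EHCVIS itself uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KS:
    name: str
    reliability: float  # hypothetical attribute in [0, 1]


@dataclass(frozen=True)
class Region:
    name: str
    x: float
    y: float     # assumed convention: larger y is higher in the image
    size: float


def rel(low, high):
    """KS constraint (rel low high): reliability must lie within [low, high]."""
    return lambda ks: low <= ks.reliability <= high


def loc_above(x, y):
    """Region constraint (loc-above x y); only the vertical test is modelled, x is unused."""
    return lambda region: region.y >= y


def size(min_size, max_size):
    """Region constraint (size min-size max-size)."""
    return lambda region: min_size <= region.size <= max_size


def conj(*constraints):
    """(conj ...): every listed constraint must hold."""
    return lambda item: all(c(item) for c in constraints)


def select(items, constraint):
    return [item for item in items if constraint(item)]


# Hypothetical KSs and regions.
kss = [KS("KS1", 0.9), KS("KS2", 0.5), KS("KS3", 0.75)]
regions = [
    Region("R3", 10, 200, 5000),
    Region("R14", 40, 30, 8000),
    Region("R6", 70, 220, 90),
]

candidate_kss = select(kss, rel(0.7, 1))
candidate_regions = select(regions, conj(loc_above(0, 150), size(1000, 10000)))
```

With these made-up values, `KS1` and `KS3` satisfy (rel 0.7 1), and only `R3` is both high enough and large enough.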




5.1.2  Generating  Alternatives.  The  output  from  the  KS  and  region  selection 
process  is  a  list  of  KSs  to  possibly  task  and  a  set  of  regions  in  the  image  these  KSs  might 
be  asked  to  obtain  information  about.  It  might  be  necessary  or  desirable  to  task  multiple 
KSs from the list of candidate KSs - e.g., $KS_1 \wedge KS_2$ - on a collection of regions - e.g.,
$R_5 \cup R_{50}$. If we let $k$ and $r$ represent the list of selected KSs and regions respectively,
the system generates the sets:

$$K \subseteq \mathcal{P}(k) \quad \text{and} \quad R \subseteq \mathcal{P}(r). \qquad (19)$$

A typical $\kappa \in K$ might be the set $\{KS_1, KS_2\}$, and should be interpreted to mean
$KS_1 \wedge KS_2$. Similarly, a typical $\rho \in R$ might be the set $\{R_3, R_{14}, R_6\}$, and should be
interpreted to mean $R_3 \cup R_{14} \cup R_6$. If $k$ and/or $r$ are large, the pragmatics of generating
$K$ and $R$ could be prohibitive. However, in practice we might know a priori that some
KSs cannot be simultaneously tasked, thus eliminating some possibilities. Sometimes
the KS or region selection constraints might keep the size of $k$ and $r$ relatively small.
Other times, however, $k$ and $r$ can remain relatively large. When this is the case,
EHCVIS randomly chooses a manageable subset of $k$ and $r$ to work with. The size of the
subset chosen is a function of the available computational resources. And we have pointed
out that systems must perform a similar operation when the amount of data becomes
overwhelming and the information to help prune the choices is not available [43].

Once $K$ and $R$ have been generated the frame of discernment, $\Theta_A$, from which
EHCVIS must choose an alternative is defined to be:

$$\Theta_A \subseteq \{invoke-\} \times K \times R. \qquad (20)$$

For example, consider the following propositions in $\Theta_A$:

$$\begin{aligned}
\Theta_A = \{\, & invoke - KS_1 \wedge KS_2 - R_3, \\
& invoke - KS_5 \wedge KS_1 \wedge KS_2 - R_3 \cup R_{14}, \; \ldots \,\}
\end{aligned} \qquad (21)$$

We interpret an alternative, $\theta \in \Theta_A$, of the form $invoke - KS_1 \wedge \ldots \wedge KS_n - R_1 \cup \ldots \cup R_k$
to mean task $KS_1$ and $KS_2$ and ... and $KS_n$ to simultaneously obtain information
from the region formed by $R_1 \cup R_2 \cup \ldots \cup R_k$ - i.e., the region formed by the union of
$R_1$ and $R_2$ and, ..., and $R_k$. But given a set of alternatives, what control strategies
might a system employ to help decide which alternative to pursue?
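
The construction of $K$, $R$, and the frame of alternatives in equations 19 through 21 can be sketched as below. This is only an illustration under stated assumptions: it uses all non-empty subsets, caps the candidate lists at a hypothetical `max_items` by random sampling (standing in for the resource-dependent choice described above), and takes the full product, whereas the text only requires a subset of it. The names `build_alternatives` and `label` are hypothetical.

```python
import random
from itertools import combinations, product


def nonempty_subsets(items):
    """All non-empty subsets of items, smallest first."""
    items = sorted(items)
    return [
        frozenset(c)
        for n in range(1, len(items) + 1)
        for c in combinations(items, n)
    ]


def build_alternatives(kss, regions, max_items=4, rng=None):
    """Return a list of ('invoke', KS subset, region subset) alternatives."""
    rng = rng or random.Random()
    kss, regions = sorted(kss), sorted(regions)
    # If a candidate list is too large, work with a random manageable subset.
    if len(kss) > max_items:
        kss = rng.sample(kss, max_items)
    if len(regions) > max_items:
        regions = rng.sample(regions, max_items)
    K = nonempty_subsets(kss)
    R = nonempty_subsets(regions)
    return [("invoke", k, r) for k, r in product(K, R)]


def label(alternative):
    """Readable name: KSs joined by '^' (and), regions joined by 'U' (union)."""
    _, k, r = alternative
    return "invoke-" + "^".join(sorted(k)) + "-" + "U".join(sorted(r))


theta_a = build_alternatives(["KS1", "KS2"], ["R3", "R14"])
labels = {label(a) for a in theta_a}
```

Two KSs and two regions give three non-empty subsets each, hence nine alternatives. The exponential growth of the subsets is what makes the cap necessary for larger lists.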

5.1.3  Control  strategies.  EHCVIS  has  eleven  “primitive”  control  strategies  which 
can  be  used  to  help  decide  which  alternative  action  to  pursue.  One  primitive  control 
strategy is to obtain information in support of or against hypotheses about which the system
is most ignorant. This strategy might be used to reduce the system’s ignorance
about the truthfulness or falseness of a label hypothesis. A second strategy is to obtain
information that will help to reduce the system’s ambiguity about the truthfulness or
falseness of a subset of label hypotheses in $\Theta_q$. More complex control strategies are
formed  by  “merging”  two  or  more  primitive  control  strategies.  The  details  of  this  merging 
process  will  be  discussed  shortly. 

Currently,  EHCVIS  uses  a  simple  “table-driven”  scheme  to  decide  which  control 
strategies  should  be  used.  For  each  goal  the  system  is  expected  to  reach,  there  is  an 
entry  in  a  table  that  lists  a  subset  of  the  available  primitive  control  strategies  that  should 
be  used  to  help  reason  about  what  action  to  pursue.  For  example,  with  respect  to  the 
start-up  goal,  EHCVIS’s  table  currently  indicates  that  two  primitive  control  strategies 
should  be  simultaneously  used:  the  strategy  of  invoking  the  most  reliable  KS,  and  the 
strategy  of  obtaining  information  about  hypotheses  the  system  is  most  ignorant  about. 

A  more  complex  strategy  is  specified  by  enumerating  two  or  more  primitive  control 
strategies in this table. Now let us describe how they are effectively implemented within our
control framework.

5.2 PHASE TWO:

5.2.1 Building $\Theta_A$. Each primitive control strategy can be viewed as a “control
feature space.” Each control strategy is associated with a small world of control knowledge
that might be brought to bear on the question of which action to take. Suppose we let
$F_1$ represent the control feature space that is related to the reliability of KSs. Then
within the context of the DS theory, we must enumerate the set, $\mathcal{F}_1$, of control feature
propositions that are associated with $F_1$, and then define the characteristic function:

$$\chi_1 : \mathcal{F}_1 \to \mathcal{P}(\Theta_A). \qquad (22)$$

EHCVIS enumerates the $\mathcal{F}_i$s and constructs the $\chi_i$s dynamically because, unlike $\Theta_q$, the set
of alternative actions a system might pursue, as represented by $\Theta_A$, typically cannot be
known a priori.

EHCVIS dynamically builds $\Theta_A$ in the following manner. Suppose $a_1$, $a_2$, and $a_3$
correspond to the following alternatives in $\Theta_A$:

$$\begin{aligned}
\Theta_A = \{\, & invoke - KS_1 \wedge KS_2 - R_3 \quad (a_1), \\
& invoke - KS_5 \wedge KS_1 \wedge KS_2 - R_3 \cup R_{14} \quad (a_2), \\
& invoke - KS_5 \wedge KS_{10} - R_{14} \quad (a_3) \,\}.
\end{aligned} \qquad (23)$$

Then the following control feature propositions of $\mathcal{F}_1$ will be dynamically constructed:

$f_1^1$: The most reliable KSs are $KS_1$ and $KS_2$;

$f_1^2$: The most reliable KSs are $KS_5$ and $KS_1$ and $KS_2$;

$f_1^3$: The most reliable KSs are $KS_5$ and $KS_{10}$.

The reason these particular propositions have been enumerated is that prior to obtaining
any information about the reliability of KSs, it is possible the KSs specified in each
alternative might actually be the most reliable. Therefore each $f_1^k \in \mathcal{F}_1$, where
$1 \le k \le |\mathcal{F}_1|$, reflects this possibility, and it is the task of a CKS to dynamically measure this
control-feature and then express its belief about which action, if taken, would result in
tasking the most reliable KSs.

The function $\chi_1$ can be defined, in words, as follows. For each $f_1^k \in \mathcal{F}_1$ extract
the KSs the control-feature proposition claims are the most reliable - e.g., $f_1^3$ claims $KS_5$
and $KS_{10}$ are the most reliable. Then include in the characteristic set of each $f_1^k$ those
actions that specify the same set of KSs. For example, $\chi_1(f_1^1) = \{a_1\}$, $\chi_1(f_1^2) = \{a_2\}$,
and $\chi_1(f_1^3) = \{a_3\}$.

If there were a fourth alternative, say $a_4 \in \Theta_A$, that was defined to be $invoke - KS_1 \wedge
KS_2 - R_3 \cup R_1$, then the characteristic set of $f_1^1$ would be $\chi_1(f_1^1) = \{a_1, a_4\}$.
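
The verbal definition of the characteristic function can be sketched as a grouping of alternatives by the component each control feature proposition talks about. The data layout below (alternatives as pairs of KS set and region set) and the function name `characteristic_sets` are assumptions made for illustration.

```python
def characteristic_sets(alternatives, component):
    """Map each distinct value of one component to the set of alternatives sharing it.

    component 0 groups by the KS set (reliability feature space);
    component 1 groups by the region set (region-ignorance feature space).
    """
    chi = {}
    for name, alternative in alternatives.items():
        chi.setdefault(alternative[component], set()).add(name)
    return chi


# The three alternatives of equation 23, as (KS set, region set) pairs.
alternatives = {
    "a1": (frozenset({"KS1", "KS2"}), frozenset({"R3"})),
    "a2": (frozenset({"KS5", "KS1", "KS2"}), frozenset({"R3", "R14"})),
    "a3": (frozenset({"KS5", "KS10"}), frozenset({"R14"})),
}
chi_1 = characteristic_sets(alternatives, component=0)

# Adding the fourth alternative from the text enlarges one characteristic set.
alternatives["a4"] = (frozenset({"KS1", "KS2"}), frozenset({"R3", "R1"}))
chi_1_with_a4 = characteristic_sets(alternatives, component=0)
```

Because `a1` and `a4` name the same KSs, the proposition "the most reliable KSs are KS1 and KS2" maps to both of them, as in the text.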

If our reliability CKS believes that $KS_5$ and $KS_1$ and $KS_2$ are the most reliable,
then it may express this belief by attributing a portion of its unit mass in support of $f_1^2$,
attributing more mass to $f_1^2$ the more it believed $f_1^2$ to be true, and less mass the less
it believed $f_1^2$ was true. Alternatively, it may attribute more mass to $\neg f_1^2$ the more it
believed $f_1^2$ was not true.

Similarly, if the same CKS believes that $KS_5$ and $KS_1$ and $KS_2$, or $KS_5$ and
$KS_{10}$, are the most reliable, then it may convey this opinion by assigning a portion of its
unit mass in support of the disjunction $f_1^2 \vee f_1^3$. Our reliability CKS may express total
ignorance about which KSs it believes are the most reliable by assigning all of its unit
mass to the disjunction $f_1^1 \vee f_1^2 \vee f_1^3$ - i.e., $\Theta_A$.

Now consider a second primitive control strategy of obtaining information about regions
that the system has the least information about. Again, associated with this strategy
is a control feature space, say $F_2$, and a related set of control feature propositions $\mathcal{F}_2$. In
this case, each $f_2^k \in \mathcal{F}_2$ would be of the form, for example, “The region the system knows
the least about is $R_3$”, and so on for $1 \le k \le |\mathcal{F}_2|$. Next the system would define $\chi_2$
in a similar manner as it defined $\chi_1$ except that it would be with respect to the regions
specified in each feature proposition and alternative. And our “region ignorance” CKS,
like our reliability CKS, would be free to express any degree of support for or against any
proposition or disjunction of propositions it desires. Now the frame $\Theta_A$ in this simple
example is defined to be:

$$\Theta_A = \bigcup_{i=1}^{2} \chi_i(f_i^k). \qquad (24)$$

And as a consequence of defining more than one characteristic function with respect to the
same  frame  of  discernment,  a  more  complex  control  strategy  has  effectively  been  defined. 
That  is,  the  strategy  of  tasking  the  most  reliable  KSs  on  the  regions  the  system  is  most 
ignorant  about. 

If a frame of discernment is the mechanism by which complex control strategies
are defined, Dempster’s rule is the machinery by which they are effectively implemented.
If  our  reliability  CKS  believes  the  proposition  /*  V  f?  is  true  and  our  region  ignorance 
CKS  believes  the  proposition  /|  V  is  true  then  Dempster’s  rule  determines  if  there 
exists an alternative action that both CKSs agree is appropriate to take. This is accomplished
by intersecting the characteristic sets of the two propositions. In this instance, the
action $a_2$ is the only alternative both CKSs agree the system should pursue. Thus, we
have  implemented  the  more  complex  strategy  of  tasking  the  most  reliable  KSs  on  regions 
the  system  is  least  knowledgeable  about.  In  cases  where  a  consensus  opinion  does  not 
exist, Dempster’s rule informs the system of this via its conflict measure $K$. When $K$
becomes relatively large, a system must consider four possible causes: (1) one or more
CKSs expressed inaccurate opinions; (2) the frame is incomplete (i.e., some alternatives
are  missing  from  the  model);  (3)  the  goals  are  not  satisfiable;  or  (4)  a  combination  of  the 
previous  three. 
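
The way Dempster's rule implements a merged control strategy, and reports conflict when no consensus exists, can be sketched as follows. The mass values and the particular disjunctions each CKS supports are hypothetical, chosen only so that the intersection of the two characteristic sets is a single alternative; `dempster_combine` is an illustrative helper, not EHCVIS code.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Dempster's rule over dict-based mass distributions; returns (m3, K)."""
    combined, conflict = {}, 0.0
    for (b1, w1), (b2, w2) in product(m1.items(), m2.items()):
        common = b1 & b2
        if common:
            combined[common] = combined.get(common, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    if conflict >= 1.0:
        raise ValueError("K = 1: no consensus is possible")
    return {b: w / (1.0 - conflict) for b, w in combined.items()}, conflict


theta_a = frozenset({"a1", "a2", "a3"})

# Hypothetical opinions: each CKS supports a disjunction of its feature
# propositions, expressed here directly as a characteristic set of actions.
reliability_cks = {frozenset({"a1", "a2"}): 0.8, theta_a: 0.2}
region_cks = {frozenset({"a2", "a3"}): 0.7, theta_a: 0.3}
consensus, K_agree = dempster_combine(reliability_cks, region_cks)

# Hypothetical disagreement: the two CKSs back disjoint alternatives.
stubborn_reliability = {frozenset({"a1"}): 0.9, theta_a: 0.1}
stubborn_region = {frozenset({"a3"}): 0.9, theta_a: 0.1}
_, K_clash = dempster_combine(stubborn_reliability, stubborn_region)
```

In the first case the two sets intersect in `{a2}`, which receives mass 0.56 with no conflict. In the second case 0.81 of the product mass falls on the empty set, which is the signal that one of the four causes listed above should be examined.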

5.2.2 How CKSs make measurements and convey beliefs. Consider two distinct
types of control related information that CKSs might obtain:

1) IGNORANCE (Igr): the total amount of evidence that neither supports nor
refutes the truthfulness of a label hypothesis.

2) AMBIGUITY (Amb): the total amount of evidence that fails to support or
refute choosing a label hypothesis over its negation.

We defined the ignorance measure of a proposition, say $p \in \Theta_q$, in equation 16. The
ambiguity measure for a proposition, e.g., $p \in \Theta_q$, is defined as:

$$Amb(p) = \begin{cases}
Pls(p) - Spt(\neg p), & \text{for } Pls(p) > Spt(\neg p); \\
Pls(\neg p) - Spt(p), & \text{for } Pls(\neg p) > Spt(p); \\
0, & \text{otherwise.}
\end{cases} \qquad (25)$$

In  words,  it  is  a  measure  of  the  amount  of  overlap  of  the  evidential  intervals  of  a  proposition 
p  and  its  negation,  and  thus  represents  the  evidence  that  does  not  help  to  support  or 
refute  p. 
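
The verbal description above, the overlap of the evidential intervals of $p$ and its negation, can be sketched as follows. This takes the overlap reading literally and computes the length of the intersection of the two intervals $[Spt(p), Pls(p)]$ and $[Spt(\neg p), Pls(\neg p)]$; treating that as the way equation 25 is applied is an assumption, and the function name `ambiguity` is hypothetical.

```python
def ambiguity(spt_p, spt_not_p):
    """Length of the overlap between the evidential intervals of p and not-p.

    Uses Pls(p) = 1 - Spt(not p) and Pls(not p) = 1 - Spt(p).
    """
    pls_p = 1.0 - spt_not_p
    pls_not_p = 1.0 - spt_p
    overlap = min(pls_p, pls_not_p) - max(spt_p, spt_not_p)
    return max(0.0, overlap)


# Little committed either way: the intervals [0.2, 0.7] and [0.3, 0.8] overlap.
weakly_decided = ambiguity(spt_p=0.2, spt_not_p=0.3)

# Strong support for p: the intervals [0.7, 0.9] and [0.1, 0.3] do not overlap.
clearly_decided = ambiguity(spt_p=0.7, spt_not_p=0.1)
```

The first call gives an overlap of 0.4 (from 0.3 to 0.7), and the second gives 0 because the interval for $p$ lies entirely above the interval for its negation.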

Now  that  we  know  how  some  CKSs  can  measure  the  ignorance  and  ambiguity  of 
propositions, how do they choose which control feature propositions, i.e., $f_i^k \in \mathcal{F}_i$, to
support, and how much to support them? In EHCVIS, the specification of each KS contains
a list of feature propositions, $f_i^k \subseteq \Theta_q$, that it can possibly express some opinion about. Just
as some sensors in the real world can only perceive certain bandwidths of energy, some
KSs can only observe certain features. As a consequence, each KS is capable of discerning
only a subset of the label hypotheses of interest. A CKS uses the information in a KS’s
specification to help decide how much support to give to control feature propositions,
$f_i \subseteq \Theta_A$. Let us provide an example of how this is accomplished.

Suppose  our  ambiguity  resolving  CKS  has  measured  the  ambiguity  of,  say  ten,  disjoint 
label  hypotheses  of  interest  and  determined  that  the  system  is  most  ambiguous  about 
two of these label hypotheses, say $p_4, p_6 \in \Theta_q$. Consider the following partial KS
specification that might be used by our ambiguity CKS:

$KS_1$: For the set of observable feature propositions $\mathcal{F}_1 = \{f_1^1, f_1^2\}$,

$\chi_1(f_1^1) = \{p_2, p_3\}$,
$\chi_1(f_1^2) = \{p_5\}$;

$KS_5$: For the set of observable feature propositions $\mathcal{F}_5 = \{f_5^1\}$,

$\chi_5(f_5^1) = \{p_5, p_6\}$.

To reduce the ambiguity between $p_4$ and $p_6$ our system must obtain information from
KSs that support either $p_4$ or $p_6$ but not both, or support either $\neg p_4$ or $\neg p_6$ but not
both. We can see from the above partial KS specifications that $KS_1$ cannot provide such
information. If $KS_1$ supports any subset of the feature propositions in $\mathcal{F}_1$ then both $p_4$
and $p_6$ become less plausible. Conversely, $p_4$ gains no support over $p_6$, or vice versa,
if $KS_1$ refutes any subset of the feature propositions in $\mathcal{F}_1$. Unlike $KS_1$, if $KS_5$ gives
support to any subset of its feature propositions then the amount of ambiguity between
the two label hypotheses will be reduced. Thus, pursuing those actions that result in
invoking $KS_5$ is more appropriate than pursuing those actions that invoke $KS_1$. The
manner  in  which  CKSs  compute  the  degree  to  which  any  alternative  should  be  supported 
is  discussed  in  [44].  However,  in  a  later  section  we  shall  present  a  simple  example  to 
illustrate  the  effect  of  the  methods  some  CKSs  use. 
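
The qualitative argument above, that a KS helps only if some proposition it can observe separates the two ambiguous hypotheses, can be sketched with a simple test on the characteristic sets. This is a deliberate simplification for illustration: it checks only whether a KS could favor one hypothesis over the other, not how much support a CKS should assign (which is the subject of [44]). The function name `can_discriminate` is hypothetical.

```python
def can_discriminate(characteristic_sets, p, q):
    """True if some observable proposition covers exactly one of p and q."""
    return any((p in s) != (q in s) for s in characteristic_sets)


# Characteristic sets from the partial KS specifications in the text.
ks1_sets = [{"p2", "p3"}, {"p5"}]
ks5_sets = [{"p5", "p6"}]

ks1_helps = can_discriminate(ks1_sets, "p4", "p6")
ks5_helps = can_discriminate(ks5_sets, "p4", "p6")
```

None of the sets for $KS_1$ contains $p_4$ or $p_6$, so it cannot favor one over the other, whereas the single set for $KS_5$ contains $p_6$ but not $p_4$.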

5.2.3  Decision  criteria.  As  a  consequence  of  CKSs  expressing  their  opinions  in 
terms of mass functions $M$, an evidential interval is induced over the alternatives in $\Theta_A$.
Selection  of  the  appropriate  action  requires  that  these  evidential  intervals  be  evaluated. 
Although  a  complete  classical  utility  theory  for  evaluating  an  interval  representation  of 
belief  is  not  yet  available,  it  is  possible  to  choose  actions  on  the  basis  of  several  simple 
criteria.  For  instance,  the  best  action  is  obvious  for  those  alternatives  with  nonoverlapping 
intervals. For those choices with overlapping intervals, further evaluation is called for.
There  are  many  utility-  vs.  cost-based  theories  that  might  be  used  to  select  an  action  on 
the  basis  of  beliefs  that  are  constrained  by  an  evidential  interval.  Although  the  details  of 
how  such  theories  might  be  employed  are  beyond  the  scope  of  this  paper,  we  can  describe 
the  simple  decision  measure  and  criterion  that  EHCVIS  uses. 

This  measure  is  motivated  by  the  intuition  that  we  should  choose  an  alternative  if  the 
sum  of  the  support  for  it  minus  the  sum  of  the  support  for  its  competitors  is  greater  than 
this same measure for the remaining alternatives. The decision criterion is to
pursue the alternative indicated by the proposition having the largest value of this
measure. Since Spt and Dbt represent the sum of the support for and against a
proposition, respectively, we can characterize this decision measure and criterion through
the following equation:

\[ \max_{a}\,\bigl[\,Dec(a) = Spt(a) - Dbt(a)\,\bigr]. \tag{26} \]

For  the  case  where  this  measure  is  the  same  for  two  or  more  alternatives,  a  random  choice 
is  made. 
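
The measure in equation 26 and the random tie-break can be sketched as follows. This is only an illustration: it assumes the Spt and Dbt of each alternative are already available as plain dictionaries, and the function name `choose_alternative` and the sample values are hypothetical rather than taken from EHCVIS.

```python
import math
import random


def choose_alternative(spt, dbt, rng=random):
    """Pick the alternative with the largest Dec(a) = Spt(a) - Dbt(a).

    spt and dbt are hypothetical inputs mapping each alternative to its
    support and doubt. Alternatives tied for the maximum are resolved by
    a random choice, as described in the text.
    """
    dec = {a: spt[a] - dbt[a] for a in spt}
    best = max(dec.values())
    tied = [a for a, value in dec.items() if math.isclose(value, best)]
    return rng.choice(tied), dec


# Illustrative values only.
spt = {"a1": 0.6, "a2": 0.3, "a3": 0.1}
dbt = {"a1": 0.1, "a2": 0.2, "a3": 0.5}
choice, dec = choose_alternative(spt, dbt)
print(choice, dec)
```

With these sample values the first alternative has the largest measure, so no tie-break is needed; the random choice only matters when two or more alternatives share the maximum.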

Unfortunately, an epistemological justification for this decision criterion cannot be
offered at this time. Others have suggested that the plausibility, Pls, of an
alternative is adequate by itself [2]. However, our view is that further investigation might
reveal that a combination of evidential measures is more appropriate under different
circumstances.

5.3 PHASE THREE:

5.3.1  Pursuing  an  alternative.  A  general  purpose  computer  vision  system  must 
have  a  relatively  large  and  sophisticated  set  of  KSs  in  order  to  interpret  images  of  complex 
scenes  [19].  This  is  due,  in  part,  to  the  need  for  a  variety  of  information  that  typically 
cannot  be  provided  by  a  single  source.  The  process  of  building  the  necessary  KSs  remains 
an active area of research [45], [19], [11], [12], [22], [5]. Although there have been
significant advances in the number and quality of KS-like feature extraction
procedures, the sufficiently sophisticated and diverse KSs that are needed to
implement a general purpose computer vision system are not readily available. Due to this
lack of resources, pursuing an action in EHCVIS is accomplished by simulating the tasking
or invocation of KSs. Let us describe this process.

5.3.2  Simulating  the  invocation  of  KSs.  EHCVIS  has  a  pool  of  nineteen  KSs 
that  are  capable  of  providing  a  variety  of  information.  A  subset  of  these  KSs  are  typically 
called  low-level  feature  extraction  processes.  In  aggregate,  these  KSs  can  express  opinions 
about  a  region’s  texture,  spectral  properties,  two-dimensional  spatial  relationships  to  other 
regions,  and  its  polygonal  shape.  The  remaining  subset  of  KSs  are  typically  considered 
higher-level  sources  of  information,  called  object  KSs.  The  object  KSs,  in  aggregate,  can 
express  opinions  about  the  presence  or  absence,  in  a  region,  of  visual  entities  such  as  roofs, 
houses,  grass,  tree  crowns,  and  so  on.  Objects  can  be  viewed,  in  a  sense,  as  features  of 
more  complex  scenes  such  as  residential  neighborhood  scenes,  farm  scenes,  and  city  scenes. 
Just as objects exhibit certain shape, spectral, and texture features, so can complex scenes
exhibit features such as houses, roofs, grass, and roads.

Every region in a segmented image that the system is expected to interpret has
nineteen mass functions associated with it, one for each KS. Each mass function represents
a subjective estimate of the best opinion the corresponding KS could possibly convey if it
were asked to extract feature information from the region under examination. These
subjective mass functions are generated and stored in an image data-base prior to
interpretation. They were derived from an empirical and/or theoretical analysis
of the algorithms that a real KS is expected to use when forming opinions. This evaluation
process was repeated for each KS and for all the regions that were known to contain a
specific object the system might be expected to discern. For regions containing multiple
objects, a different set of statistics would be computed and, as a consequence, a different
mass function would be generated and stored in the image data-base. The details
of the process are explained in [44].

Forming  the  best  opinion  a  KS  might  convey  is  not  the  objective  of  our  simulation. 
Rather, the data-base of subjective mass functions is required in order to begin the
simulation process, which can be summarized in the five steps shown in Figure 5.

Recall that one of our motives for using the DS theory is its increased ability to deal
with limited evidential information. To the extent that a KS’s opinion is modeled with
respect to the three characteristics of such information (certainty, precision, and
accuracy), we will be able to evaluate the viability of the DS theory in our task domain.
Therefore, what we are truly simulating is the degradation of a KS’s opinion (i.e., mass
function) with respect to certainty, precision, and accuracy.

The degradation process can be modeled as a function, D, of five parameters: the frame Θ_q,
a mass function M, a certainty factor (cer), a precision factor (pre), and an accuracy
factor (acc). In equation form:

\[ D(\Theta_q, M, cer, pre, acc) = M', \tag{27} \]

where M' is the degraded mass function. The cer, pre, and acc parameters specify
the extent to which M is to be degraded with respect to the three characteristics of
information. Let us provide our intuition and computational definition of how a mass
function M is degraded to M'.

A KS should attribute a greater portion of its unit mass to a proposition, say p ⊂ Θ_q,
the more certain it is about the truthfulness of that proposition. Conversely, a
proportionately smaller amount of mass should be attributed to p if a KS is less certain
about the proposition’s truthfulness. In our simulation of the degradation in a KS’s mass
function, the cer parameter determines the degree to which a KS’s opinion
should be made less certain, i.e., how much to reduce the amount of mass that has been
attributed to a proposition. For 0 ≤ cer ≤ 1, the degradation of a KS’s mass function M
with respect to the certainty of a proposition p is given by:

\[
M'(p) =
\begin{cases}
cer \cdot M(p), & \text{for } p \subset \Theta_q; \\
M(\Theta_q) + \sum_{s \subset \Theta_q} (1 - cer) \cdot M(s), & \text{for } p = \Theta_q.
\end{cases}
\tag{28}
\]

Notice that the amount of mass that was originally attributed to total ignorance (i.e.,
Θ_q) is increased by the sum of the mass that was “taken” away from proper subsets of
Θ_q. Doing so ensures that the constraint in equation 11 remains satisfied.
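
A minimal sketch of the certainty degradation in equation 28 is given below. It assumes a mass function is stored as a dictionary from frozensets of label hypotheses to masses; that representation, the function name `degrade_certainty`, and the small three-element frame are illustrative choices, not a description of the EHCVIS implementation.

```python
import math


def degrade_certainty(mass, frame, cer):
    """Scale the mass of every proper subset by cer and move the
    remainder to the whole frame (total ignorance)."""
    frame = frozenset(frame)
    degraded = {}
    removed = 0.0
    for prop, m in mass.items():
        if prop != frame:
            degraded[prop] = cer * m
            removed += (1.0 - cer) * m
    # Mass taken from proper subsets is added to the frame itself.
    degraded[frame] = mass.get(frame, 0.0) + removed
    return degraded


# Illustrative frame and opinion only.
frame = frozenset({"sky-scene", "grass-scene", "road-scene"})
mass = {
    frozenset({"sky-scene"}): 0.6,
    frozenset({"sky-scene", "grass-scene"}): 0.3,
    frame: 0.1,
}
degraded = degrade_certainty(mass, frame, cer=0.5)
print(degraded[frame])
print(math.isclose(sum(degraded.values()), 1.0))
```

Because the mass removed from proper subsets is returned to the frame, the degraded masses still sum to one, which is the property the text attributes to equation 11.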

A KS that attributes a nonzero amount of mass to a singleton in a frame is said to
be expressing the most precise opinion possible with respect to that frame. Conversely, a
KS that attributes a nonzero amount of mass to Θ is expressing the least precise opinion
possible. That is, attributing nonzero mass to a set of cardinality one expresses
a very precise opinion, and the precision of that opinion decreases as the cardinality of
the set increases. The pre parameter in our simulation process controls the cardinality
of a proposition p, i.e., of its corresponding subset of Θ_q. For 0 ≤ pre ≤ 1, the degree to
which a KS’s mass function M is degraded with respect to a proposition p is given by:

\[ M'(p') = p \cup \text{ran-set-gen}(\Theta_q - p,\ pre), \tag{29} \]

where for some set s ⊂ Θ_q, ran-set-gen(s, pre) returns a random set s' ⊆ Θ_q of
cardinality (1 - pre) * |s|. Since our system does not have any particular knowledge about
how a KS’s opinion becomes less precise, the ran-set-gen function randomly selects the
propositions to include in s'.
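
One possible reading of equation 29 is sketched below. It rests on several assumptions that the text does not state: the mass M(p) is taken to move to the enlarged proposition p' = p ∪ ran-set-gen(Θ_q - p, pre), the cardinality (1 - pre) * |s| is rounded to the nearest integer, and masses that land on the same enlarged set are added together. The dictionary representation and the names `ran_set_gen` and `degrade_precision` are likewise illustrative.

```python
import math
import random


def ran_set_gen(s, pre, rng):
    """Return a random subset of s with about (1 - pre) * |s| elements.
    Rounding to the nearest integer is an assumption of this sketch."""
    k = round((1.0 - pre) * len(s))
    return frozenset(rng.sample(sorted(s), k))


def degrade_precision(mass, frame, pre, rng=None):
    """Widen every focal proposition p to p' = p | ran_set_gen(frame - p, pre)
    and carry the mass of p over to p' (assumed reading of equation 29)."""
    rng = rng or random.Random()
    frame = frozenset(frame)
    degraded = {}
    for prop, m in mass.items():
        widened = prop | ran_set_gen(frame - prop, pre, rng)
        degraded[widened] = degraded.get(widened, 0.0) + m
    return degraded


# Illustrative frame and opinion only.
frame = frozenset({"sky-scene", "grass-scene", "road-scene", "roof-scene"})
mass = {frozenset({"sky-scene"}): 0.7, frame: 0.3}
degraded = degrade_precision(mass, frame, pre=0.7, rng=random.Random(0))
print(sorted(len(p) for p in degraded))
print(math.isclose(sum(degraded.values()), 1.0))
```

With pre = 1 nothing is added and the opinion is unchanged; with pre = 0 every proposition is widened to the whole frame, so the opinion collapses to total ignorance.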

A KS is said to be expressing an inaccurate opinion if it attributes a nonzero amount
of mass to any proposition that is not true with respect to the available evidence.
Furthermore, the more mass it attributes to a false proposition, the more it is in error. In our
simulation scheme, for 0 ≤ acc ≤ 1, the degree to which a KS’s opinion is accurate, the
degradation of that opinion with respect to accuracy is given by:

\[
\begin{aligned}
M'(\neg p) &= (1 - acc) \cdot M(p); \\
M'(p) &= acc \cdot M(p).
\end{aligned}
\tag{30}
\]
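
Equation 30 can be sketched in the same style. The sketch assumes that ¬p is the set complement of p within the frame, that mass on the frame itself is left untouched (its complement is empty), and that contributions falling on the same proposition are summed; none of these details is spelled out in the text, and the name `degrade_accuracy` is hypothetical.

```python
import math


def degrade_accuracy(mass, frame, acc):
    """Keep a fraction acc of each proposition's mass and move the
    remaining (1 - acc) to the proposition's complement in the frame."""
    frame = frozenset(frame)
    degraded = {}

    def add(prop, m):
        if m > 0.0:
            degraded[prop] = degraded.get(prop, 0.0) + m

    for prop, m in mass.items():
        if prop == frame:
            add(prop, m)  # assumption: ignorance is not made "inaccurate"
            continue
        add(prop, acc * m)
        add(frame - prop, (1.0 - acc) * m)
    return degraded


# Illustrative frame and opinion only.
frame = frozenset({"front-wall-scene", "side-wall-scene", "roof-scene"})
mass = {frozenset({"front-wall-scene"}): 0.8, frame: 0.2}
degraded = degrade_accuracy(mass, frame, acc=0.75)
print(degraded)
print(math.isclose(sum(degraded.values()), 1.0))
```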


We  argue  that  the  cer  ,  pre ,  and  acc  parameters  allow  our  simulator  to  model  most, 
if  not  all,  of  the  ways  opinions  might  vary  when  expressed  in  terms  of  propositions  in  a 
frame  of  discernment.  By  specifying  these  three  parameters,  it  is  possible  to  characterize 
any  degradation  in  an  opinion  that  might  be  expressed  by  any  real  or  imaginary  KS. 

Returning to Figure 5, the process of simulating the invocation of KSs involves first
retrieving, from an image data-base, a mass function for each KS that is invoked. Next,
each of these mass functions is degraded with respect to the cer, pre, and acc parameters.
The result is a set of degraded mass functions that are then pooled using Dempster’s rule.
Finally, the consensus opinion formed by Dempster’s rule is input to an inference engine
that updates the Spt and Pls of propositions in Θ_q.
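
The pooling and updating steps can be illustrated with a short sketch of Dempster’s rule in its standard form, together with the usual definitions of support (total mass of the subsets of a proposition) and plausibility (total mass of the sets that intersect it). It is assumed here that Spt and Pls follow these standard definitions; the dictionary representation and the function names are illustrative, and the sample opinions are invented.

```python
import math
from itertools import product


def dempster_combine(m1, m2):
    """Pool two mass functions with Dempster's rule.
    Returns the normalized consensus and the conflicting mass."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            conflict += x * y  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("the two opinions are in total conflict")
    consensus = {p: v / (1.0 - conflict) for p, v in combined.items()}
    return consensus, conflict


def spt(mass, prop):
    """Support: mass committed to subsets of prop."""
    return sum(v for s, v in mass.items() if s <= prop)


def pls(mass, prop):
    """Plausibility: mass not committed against prop."""
    return sum(v for s, v in mass.items() if s & prop)


# Two invented opinions over a small illustrative frame.
frame = frozenset({"front-wall-scene", "side-wall-scene", "roof-scene"})
walls = frozenset({"front-wall-scene", "side-wall-scene"})
front = frozenset({"front-wall-scene"})
m1 = {walls: 0.8, frame: 0.2}
m2 = {front: 0.6, frame: 0.4}
consensus, conflict = dempster_combine(m1, m2)
print(spt(consensus, front), pls(consensus, front), conflict)
```

The interval [Spt, Pls] computed this way for each label hypothesis is the evidential interval the text refers to; the conflicting mass returned alongside the consensus is the kind of quantity a later evaluation step could compare against a threshold.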

5.3.3 Long Term Memory. The frame Θ_q is the system’s relatively static
representation of the world and domain knowledge that is needed to “understand” images,
commonly called long term memory (LTM) [41], [23]. LTM is the representation of
the semantic relationship between observable features and the visual entities the system
might try to discern. In EHCVIS, the set of label hypotheses (i.e., propositions) in LTM
is defined to be:

\[
\begin{aligned}
\Theta_q = \{ & \text{tree-crown-scene},\ \text{sky-scene},\ \text{shutters-scene},\ \text{roof-scene}, \\
& \text{road-scene},\ \text{residential-scene},\ \text{house-scene},\ \text{bush-scene}, \\
& \text{Puffton-house-scene},\ \text{Griffith-house-scene},\ \text{Brown-house-scene}, \\
& \text{front-wall-scene},\ \text{side-wall-scene},\ \text{grass-scene} \}.
\end{aligned}
\tag{31}
\]

The propositions Puffton-house-scene, Griffith-house-scene, and Brown-house-scene
represent particular house scenes that are associated with a particular individual
or place. This is in contrast to a generic house scene, as represented by the proposition
house-scene. The reason -scene appears as a suffix in propositions such as
roof-scene and sky-scene is that for a particular image, or subset of regions in an image,
the system might only be observing these objects. How a conjunction of label hypotheses
that are not explicitly represented in Θ_q can be instantiated in STM is explained in [44].

There are five feature spaces, F_1 through F_5, any subset of which might be used to
partition Θ_q. Enumerating all of these feature spaces and their corresponding sets, ℱ_1
through ℱ_5, of feature propositions would be excessive for this paper. However, we shall
enumerate a subset of LTM to help make the remaining discussion more lucid.

The following three feature spaces are associated with the indicated types of visual
information that might be used to discern propositions in Θ_q:

F_1 Objects: such as tree crowns, roofs, roads, and so on;

F_2 Spectral: such as grass green, sky blue, road black, road grey, and so on;

F_3 Texture: such as highly textured grass, smoothly textured sky, and so on.

The following is an enumeration of some of the feature propositions in the above
feature spaces:

ℱ_1 has-sky-as-part, has-grass-as-part, has-walls-as-part, ...

ℱ_2 has-sky-blue-as-part, has-grass-green-as-part, ...

ℱ_3 has-bush-angular-line-density-as-part,
has-house-angular-line-density-as-part,
has-sky-angular-line-density-as-part, ...

The spectral feature proposition has-sky-blue-as-part, for instance, specifies
the object sky for two reasons. The first is that there is no universally standard
quantification of the color blue in an image that was produced by some uncalibrated
photographic process. Such photographic images are commonly used to generate a digitized
image of the original scene. The second is that, without this calibrated information, the
only way to currently capture some measure of “blueness” is to sample a collection of
regions in images that contain only blue skies. As a consequence, what one has actually
measured is not blueness but sky-blueness. In a similar fashion, one can only
measure grass-green, grass-blueish-green, road-grey, and so on.

For each ℱ_i we must construct an X_i to partition Θ_q. Again, a complete enumeration
is excessive; however, we shall list a subset of the X_i actually defined in EHCVIS. For ℱ_1:

\[
\begin{aligned}
X_1(\text{has-sky-as-part}) = \{ & \text{sky-scene},\ \text{road-scene},\ \text{residential-scene}, \\
& \text{Puffton-house-scene},\ \text{Griffith-house-scene},\ \text{Brown-house-scene} \}; \\
X_1(\text{has-grass-as-part}) = \{ & \text{grass-scene},\ \text{road-scene},\ \text{residential-scene}, \\
& \text{Puffton-house-scene},\ \text{Griffith-house-scene},\ \text{Brown-house-scene} \},
\end{aligned}
\tag{32}
\]

and so on. And finally for ℱ_3:

\[
\begin{aligned}
X_3(\text{has-tree-crown-angular-line-density-as-part}) = \{ & \text{tree-crown-scene},\ \text{road-scene},\ \text{residential-scene}, \\
& \text{Puffton-house-scene},\ \text{Griffith-house-scene},\ \text{Brown-house-scene} \}; \\
X_3(\text{has-grass-angular-line-density-as-part}) = \{ & \text{grass-scene},\ \text{road-scene},\ \text{residential-scene}, \\
& \text{Puffton-house-scene},\ \text{Griffith-house-scene},\ \text{Brown-house-scene} \},
\end{aligned}
\tag{33}
\]


and  so  on. 
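
To make the role of these mappings concrete, the two entries of equation 32 can be written as a lookup table that carries mass from feature propositions to the label hypotheses compatible with them. The translation rule used here (mass on a feature proposition moves to its image, and uncommitted mass moves to the whole frame) is an assumption made for illustration; the paper defers the actual mechanism to [44]. The names `X1`, `translate`, and the use of `None` for uncommitted mass are hypothetical.

```python
# Label hypotheses of equation 31.
FRAME = frozenset({
    "tree-crown-scene", "sky-scene", "shutters-scene", "roof-scene",
    "road-scene", "residential-scene", "house-scene", "bush-scene",
    "Puffton-house-scene", "Griffith-house-scene", "Brown-house-scene",
    "front-wall-scene", "side-wall-scene", "grass-scene",
})

SHARED = {"road-scene", "residential-scene", "Puffton-house-scene",
          "Griffith-house-scene", "Brown-house-scene"}

# The two entries of equation 32.
X1 = {
    "has-sky-as-part": frozenset({"sky-scene"} | SHARED),
    "has-grass-as-part": frozenset({"grass-scene"} | SHARED),
}


def translate(feature_mass, mapping, frame):
    """Carry mass from feature propositions to label hypotheses.
    The key None stands for mass the KS leaves uncommitted (assumption)."""
    out = {}
    for feature, m in feature_mass.items():
        target = frame if feature is None else mapping[feature]
        out[target] = out.get(target, 0.0) + m
    return out


# Invented opinion of an object KS about one region.
label_mass = translate({"has-sky-as-part": 0.7, None: 0.3}, X1, FRAME)
print(len(FRAME), sorted(X1["has-sky-as-part"] & X1["has-grass-as-part"]))
```

The intersection printed at the end shows why a single feature is rarely decisive: both feature propositions are compatible with the same five scene hypotheses, so evidence from other feature spaces is needed to tell them apart.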

The  output  of  the  simulation  process  is  a  set  of  degraded  mass  functions.  These 
mass  functions  are  then  combined  by  Dempster’s  rule  to  form  a  consensus  about  which 
label  hypothesis  is  appropriate  to  associate  with  the  current  region  of  interest.  The  result 
of applying this rule is input to an inference engine that updates the Spt and Pls of
propositions  in  LTM.  After  the  updating  is  completed  the  results  are  evaluated  in  phase 
four. 

5.4  PHASE  FOUR: 

5.4.1 Evaluating LTM. The evaluation of LTM and the state of the system up
to  this  point  can  be  characterized  in  four  steps. 

STEP 1: The first step involves determining if there was sufficient conflict between
KSs to justify verifying the KSs. The details of the verification process in
EHCVIS are complex; see [44]. However, if the KSs have been verified or
the conflict they generate is below some threshold, then proceed to Step 2.

STEP 2: In the second step, the system tries to determine if the consensus opinion
that was formed over LTM is sufficient to justify instantiating a label
hypothesis, i.e., whether the Spt of one or more label hypotheses is above some
minimum threshold. If so, then instantiate the hypotheses in a representation
called short term memory (STM). Then place on the goal stack the goal
of looking for objects that can possibly coexist with the object hypotheses
that were previously instantiated in STM, and then go to PHASE ONE;
else go to Step 3.

STEP 3: If the system reaches this step, then it is trying to reduce the amount
of ambiguity, dissonance, or ignorance for a subset of label hypotheses in
Θ_q. If the maximum number of attempts to instantiate a hypothesis has
not been reached, then the goal of obtaining additional information about
the currently best label hypotheses is put on the goal stack, and the
system proceeds to PHASE ONE; else it goes to Step 4.

STEP 4: This step is reached if EHCVIS has exhausted the maximum number of
attempts to instantiate a hypothesis. At this point, the goal of identifying
objects the system has the best chance of discerning is put on the goal
stack, and the system proceeds to PHASE ONE. If, however, the maximum
number of attempts at interpreting the entire image is exceeded, the
interpretation process is terminated.
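
The four steps above can be summarized as a small decision function. This is a sketch of the control flow only: the threshold values, the goal names, and the function signature are hypothetical, and the verification procedure of Step 1 is reduced to a flag because its details are given in [44].

```python
def evaluate_ltm(conflict, verified, spt, attempts, image_attempts,
                 conflict_max=0.5, spt_min=0.8,
                 max_attempts=3, max_image_attempts=10):
    """Return (next_phase, goal, instantiated_labels) for one evaluation.
    All thresholds are placeholders, not values used by EHCVIS."""
    # STEP 1: verify the KSs if they conflict too much and are unverified.
    if conflict >= conflict_max and not verified:
        return "VERIFY", "verify-KSs", []
    # STEP 2: instantiate label hypotheses whose Spt is high enough.
    instantiated = [label for label, s in spt.items() if s >= spt_min]
    if instantiated:
        return "PHASE ONE", "find-coexisting-objects", instantiated
    # STEP 3: otherwise ask for more information about the best hypotheses.
    if attempts < max_attempts:
        return "PHASE ONE", "get-more-information", []
    # STEP 4: give up on this hypothesis and look for easier objects,
    # unless the budget for the whole image is exceeded.
    if image_attempts <= max_image_attempts:
        return "PHASE ONE", "identify-most-discernible-objects", []
    return "TERMINATE", None, []


# Invented state: low conflict, no label supported strongly enough yet.
state = evaluate_ltm(conflict=0.1, verified=False,
                     spt={"front-wall-scene": 0.45, "side-wall-scene": 0.40},
                     attempts=1, image_attempts=2)
print(state)
```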

It is clear that the evaluation phase of our system plays an important role in controlling
the interpretation task. Indeed, the selection of goals and their constraints is dependent
on factors that our system does not yet take into account. Some of the limitations and
consequences of this are discussed in [44].

This completes the discussion of EHCVIS and how both the DS and ER technologies
have been integrated into the system’s mechanisms for reasoning about the control of the
interpretation process and reasoning about the visual entities it is expected to perceive.
Next we shall briefly discuss the interpretation experiments that were conducted and
summarize their results.

6  INTERPRETATION  EXPERIMENTS 

There  are  several  objectives  of  the  research  and  experiments  that  are  reported  here 
and  in  [44]: 

1) to demonstrate that certain types of incompleteness in LTM can be detected
when certain “evidential measures” and verification procedures are employed;

2) to demonstrate that the system tends to degrade smoothly as the quality of the
information it must reason from becomes less certain, precise, and accurate; and

3) to demonstrate that a system’s performance is improved (i.e., fewer resources are
used, better interpretations are produced, or a combination of the two) as more
evidential-based control strategies are used.

In this paper, we emphasize the latter two objectives by describing our
experimental design, method, and results.

Over one hundred and forty interpretation experiments were conducted on three
digitized and segmented color monocular images of outdoor natural scenes that are similar
to Figure 1. Each experiment involved selecting a value for each of the three degradation
parameters and then tasking EHCVIS to interpret the image. For instance, we typically
started a series of experiments with cer = pre = acc = 1; then, after the system did its
best at completing the interpretation task, the degradation parameters were set to cer = .9
and pre = acc = 1, then cer = .8 and pre = acc = 1, ..., cer = 1, pre = .9, acc = 1,
and so on until cer = pre = acc equaled .4 or .5. This sequence of degradation in the
three parameters was conducted twice for each image: once without using any
control strategies, to establish a baseline level of performance, and then with the
system allowed to employ various combinations of control strategies. This allowed
us to compare performance between the use of various control strategies and no control
strategy.
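
A schedule of parameter settings of the kind just described could be generated as in the sketch below. The exact ordering used in the experiments is not fully specified in the text, so the ordering here (each parameter degraded alone in steps of .1, then all three together) and the function name `degradation_schedule` are assumptions made for illustration.

```python
def degradation_schedule(floor_tenths=4):
    """Return a list of (cer, pre, acc) settings, starting from the
    undegraded case and stepping down by .1 to floor_tenths / 10."""
    settings = [(1.0, 1.0, 1.0)]
    levels = [t / 10 for t in range(9, floor_tenths - 1, -1)]
    # Degrade one parameter at a time, holding the other two at 1.
    for axis in range(3):
        for level in levels:
            setting = [1.0, 1.0, 1.0]
            setting[axis] = level
            settings.append(tuple(setting))
    # Finally degrade all three parameters together.
    settings.extend((level, level, level) for level in levels)
    return settings


schedule = degradation_schedule()
for cer, pre, acc in schedule[:3]:
    print(cer, pre, acc)
```

Each setting in such a schedule would be run once without control strategies and again with them, so that the two runs can be compared at the same level of degradation.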

For  each  experiment,  the  KSs  in  the  system  would  be  most  certain,  precise,  and 
accurate  when  cer  ,  pre ,  and  acc  equaled  one,  respectively.  Conversely,  the  KSs  became 
less  certain,  precise,  and  accurate  as  cer ,  pre  ,  and  acc  approached  zero,  respectively. 
Experiments  were  not  conducted  with  parameter  values  for  which  it  was  clear  the  system 
would  not  be  capable  of  interpreting  the  image. 

Several  metrics  were  used  to  measure  the  system’s  performance,  one  being  the  number 
of  correctly  instantiated  regions.  But  before  we  discuss  the  system’s  performance,  let 
us  briefly  annotate  a  portion  of  the  system’s  attempt  at  interpreting  the  image  and  its 
segmentation  in  Figures  1  and  2  respectively. 

6.1  ANNOTATED  INTERPRETATION  TASK 

The portion of an interpretation experiment described here is intended to demonstrate
one important point: by taking advantage of the additional information the DS theory
makes readily available, our system was able to identify objects that were
indiscernible when this information was unavailable.

Consider Figures 6 through 12. At a point early in the process of trying to interpret
region R_14 of the segmented image in Figure 6, the system was unable to determine
whether that region is the side-wall-scene or the front-wall-scene of the house in
the image. During this experiment, the cer, pre, and acc parameters were set to 1,
.7, and 1, respectively. That is, all the KSs were as certain and accurate as possible;
however, they were made 30% less precise than they could be. Figure 7 shows that after
reaching the maximum number of attempts to interpret the region, the system remained
most ambiguous about the side-wall-scene and front-wall-scene label hypotheses.
Thus, without using a control strategy to help resolve this ambiguity, the object in region
R_14 was not identified. As illustrated in Figure 8, the only way the system would be
capable of resolving this ambiguity would be to obtain information that distinguishes
side-wall-scene from front-wall-scene.

As mentioned earlier, as part of its specification, each KS can typically only attribute
mass to a subset of the feature propositions in some feature space. In Figure 9, we see
that KS_9 can attribute mass for or against the has-walls-as-part,
has-side-walls-as-part, and has-front-wall-as-part feature propositions. In contrast, we also see
in Figure 9 that the only feature proposition KS_1 can attribute mass for or against is
has-house-as-part. Therefore, at this point in the interpretation, KS_9 appears to be
better suited for resolving the ambiguity of current interest.

However, by allowing the system to use its ambiguity resolving control strategy, we
can begin to see in Figure 10 how the system might be able to label R_14. In Figure 10,
the two control strategies used in this experiment are underlined at the top of the figure.
The two CKSs that are responsible for measuring Igr and Amb are CKS_2 and CKS_4,
respectively. The set of possible actions the system might take (i.e., Θ_A) is enumerated
in the list under the title “Primed *action-prop-names.” These alternatives were the
same as those available to the system when the above control strategies were not used.
After both CKS_2 and CKS_4 have made their respective measurements, they construct mass
functions that reflect their opinions about which alternatives they believe are best to
pursue. We see that CKS_2 believes very strongly that taking those actions that invoke KS_9
is more appropriate than taking those actions that do not invoke KS_9. Likewise, CKS_4
believes the same, almost as strongly as CKS_2. We see in Figure 11 the result of pooling
these two opinions over Θ_A. That is, the consensus opinion strongly indicates that taking
either action a_1 or a_2 is appropriate because they result in tasking KS_9, which has the
best chance of resolving the ambiguity of concern. The result of the system actually
pursuing one of these actions, which was randomly chosen from {a_1, a_2}, is illustrated
in Figure 12. The opinion of KS_9 was such that it supported a proposition that
distinguished front-wall-scene from side-wall-scene to a degree that allowed the system
to instantiate the front-wall-scene label hypothesis for R_14.

There was no guarantee that KS_9 would have helped discern the propositions of
interest. Rather, among the available KSs it was the most likely to provide the system
with the needed information. The ambiguity control strategy biased the system to take
those actions that were most likely to result in obtaining the desired information.

The  portion  of  an  actual  experiment  just  presented  illustrates  how  EHCVIS  uses  the 
DS  and  ER  technology  to  accomplish  two  major  interpretation  tasks  that  were  described 
earlier in this paper: 1) to reason about what label hypotheses to assign to regions in an
image; and 2) to decide how its limited resources should be utilized in order to complete
the  image  interpretation  task.  A  number  of  experiments  were  conducted  using  a  variety 
of  control  strategies  (e.g.,  reliability  of  KSs,  dissonance  resolving  control  strategies,  and 
so  on)  in  conjunction  with  various  combinations  of  degradation  parameter  values. 

The results of the experiments we have conducted can be presented in a
number of ways; see [44]. Here, we shall summarize these results with respect to one
performance  measure:  the  number  of  correctly  instantiated  regions.  In  addition,  the 
results  presented  in  this  section  are  with  respect  to  experiments  on  the  image  in  Figure  1. 
However,  the  results  for  the  remaining  two  images  are  similar  to  those  presented  here. 

In summary, when all the KSs were as certain, precise, and accurate as possible (i.e.,
cer = pre = acc = 1), and no control strategies were used, the system was able to correctly
label approximately 90% of the regions it examined. When all the KSs were as certain,
precise, and accurate as possible and the system was allowed to use any number of control
strategies, the system was able to correctly label approximately 91% to 92% of the regions
examined. This suggests that “evidential control strategies” do not significantly improve
a system’s performance when its sources operate at optimum levels. However, as the KSs
became less certain, but remained as precise and accurate as possible, and no control
strategies were used, the system was able to correctly label only approximately 23% of
the regions examined for a 40% decrease in certainty. But when the system was allowed
to use any number of control strategies, it was able to correctly label as many as 70%
of the regions examined for the same 40% decrease in just the certainty of a KS’s mass
function. The level of performance was qualitatively the same when the mass functions of
KSs were degraded with respect to just accuracy or just precision. When the mass
functions of KSs were degraded with respect to certainty, precision, and accuracy
together, the improvement in the system’s performance was not as dramatic as the results
just described indicate. However, the improvement that was noticed was significant enough to
justify using these “evidential” control strategies.

In short, the results indicate that although taking advantage of the information the DS
theory provides does not significantly improve a KBS’s performance when its perceptions
are near perfect, the benefits of using such information become obvious as the quality
of a KBS’s perceptions degrades. That is, the degradation of the system’s performance is
significantly delayed.

7  REASONING  IN  COMPUTER  VISION  SYSTEMS:  RELATED  WORK 

There are some important similarities and differences between our approach to reasoning
from limited evidential information and that used by others - see for instance Nagao
and  Matsuyama  [28],  Brooks  [4],  Peter  Selfridge  [33],  Kenneth  Sloan  [37],  Thomas  Garvey 
[13],  Hanson  and  Riseman  [19],  and  Levine  and  Shaheen  [24]. 

The object recognition portion of Nagao and Matsuyama’s system uses, in part,
a Boolean approach to reasoning about the perceptions of its KS-like feature extraction
processes. Their approach is similar to ours in that semantic knowledge about objects is
represented in terms of object-features that can and cannot coexist - e.g., see table 6.1 in
[28]. The approaches differ in how beliefs about the presence or absence of object-features
in a region of interest are represented and pooled, and in how inferences are drawn from these
beliefs. In Nagao’s system, beliefs about the presence or absence of any particular object
feature in some region of interest are represented in a Boolean “yes” or “no” manner. This
Boolean decision is made in the source (e.g., KS) that must express its beliefs. These beliefs
are then pooled in a logical fashion to infer which label hypothesis should be instantiated
for the region under examination. However, the difficulties of reasoning in a Boolean
fashion have been discussed in [25], [26]. In contrast, KSs in our system express, on a
continuous scale, their partial beliefs about the presence or absence of object-features in
a region. We have previously pointed out the benefits of providing KSs with this
flexibility.

The systems of Brooks [4], Peter Selfridge [33], Kenneth Sloan [37], Thomas Garvey [13],
Hanson and Riseman [19], Levine and Shaheen [24], Yakimovsky and Feldman [46], and
Zucker [48] for the most part employ mechanisms that are probabilistic, Boolean, or an
ad hoc variant thereof for pooling beliefs and drawing inferences. Therefore, it is difficult
if not impossible for their systems to take advantage, in a formal way, of evidential
measures such as the amount of ignorance, dissonance, ambiguity, decisiveness, and so on
that a proposition might exhibit. This work suggests that the performance of their systems
might improve if they took advantage of such evidential information.

8  SUMMARY 

In this paper, we have discussed research on applying both the Dempster-Shafer theory
and the concept of evidential reasoning to begin addressing several problems that KBSs
must deal with. Our domain of application was knowledge-based computer vision. The DS
theory and the concept of ER are the foundation of a developing framework for
knowledge-based systems, such as general purpose computer vision systems, that must
reason in complex domains about both their perceptions and the actions they might pursue
in order to understand their environment. Some results from a large number of
interpretation experiments were summarized to highlight a few of the benefits of employing
these technologies in a large-scale knowledge-based system. In particular, by using
previously unavailable information such as the amount of dissonance, ignorance, and/or
ambiguity a label hypothesis exhibits, the system was able to correctly label a
significantly greater number of regions in an image.

Despite the progress of this research, there remain a significant number of problems
to address with respect to the technology we have explored and its use in knowledge-based
systems. For instance, although the DS theory has relieved us of the burden of specifying
complete probability models, a formal theory for generating mass functions remains
unavailable. We believe that this latter problem is more tractable than the former.
Another concern is whether, given the independence requirement of Dempster’s rule, there
is a formal model by which dependencies can be automatically accounted for in a frame of
discernment. Finally, and not least, there is the lack of a computational theory for
integrating “fuzzy-based” approaches to uncertain reasoning with the theory of belief
functions [47].
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Figure  1. 


A  MONO-CHROMATIC  RENDERING  OF  A  TYPICAL  STATIC  2-D  COLOR  IMAGE 
OF  AN  OUTDOOR  NATURAL  SCENE. 


Figure  2. 


AN EXAMPLE SEGMENTATION OF THE IMAGE IN FIGURE 1.


[Flow diagram with boxes labeled PHASE 1, PHASE 2, PHASE 3, PHASE 4, and STM. Recoverable
box contents, in the order they appear:]

•  Goals
•  Generate Alternatives
•  Control Strategies

•  Build Control Knowledge (i.e., $\Theta_A$)
•  Obtain Control-Related Info (i.e., invoke CKSs)
•  Pool Opinions of CKSs
•  Choose Which Action to Pursue

•  Pursue Chosen Alternative
•  Simulate Invocation of KSs
•  Pool Opinions of KSs
•  Draw Inferences over LTM

•  Evaluate Inference Results in LTM
•  Generate New Goals
•  Instantiate Hypotheses in STM
•  Terminate Interpretation Process

Figure  3. 


A  SYSTEM  FLOW  DIAGRAM  OF  EHCVIS 


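(start-up ((rel 0.7 1) ...)
          ((conj (loc-above 0 200 0 200)
                 (size 10000 65000)) ...))

GOAL-NAME/ID: start-up
KS selection constraints: ((rel 0.7 1) ...), yielding the KSs that satisfy the constraint
Region selection constraints: ((conj (loc-above 0 200 0 200) (size 10000 65000)) ...),
yielding the regions that satisfy the constraints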


Figure  4. 


EXAMPLE  GOAL  AND  CONSTRAINTS. 
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Figure  5. 


SIMULATING  KS  INVOCATIONS. 


System  is  trying  to  interpret  region  #14. 


Figure  6. 


SEGMENTATION OF AN IMAGE THE SYSTEM IS TRYING TO INTERPRET.

Without ambiguity control strategy:

preciseness parameter value = .7

invoke-KS80&KS70&KS30-R14 <=> ((KS80 KS70 KS30) (R14))

ltm inference results

tree-crown-scene        [0.0 , 0.0]
sky-scene               [0.0 , 0.0]
side-walls-scene        [0.0 , 1.0]
shutters-scene          [0.0 , 0.0]
roof-scene              [0.0 , 0.0]
road-scene              [0.0 , 0.0]
puffton-house-scene     [0.0 , 1.e-3]
house-scene             [0.0 , 1.e-3]
griffith-house-scene    [0.0 , 1.e-3]
grass-scene             [0.0 , 3.6e-2]
front-wall-scene        [0.0 , 1.0]
bush-scene              [0.0 , 0.0]
brown-house-scene       [0.0 , 0.0]
a-road-scene            [0.0 , 0.0]

Figure  7. 


CURRENT STATE OF INTERPRETATION PROCESS WITHOUT USING ANY CONTROL STRATEGY.
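
The ambiguity visible in this figure, where two wall hypotheses both receive the interval [0.0, 1.0], can be reproduced with a short sketch. It assumes that each bracketed pair is a [belief, plausibility] interval, and it uses a simplified, hypothetical mass function in which all mass sits on the disjunction of the two wall labels. The small non-zero upper bounds that the figure shows for some other labels are ignored. The function names are illustrative only.

```python
from typing import Dict, FrozenSet, Tuple

Mass = Dict[FrozenSet[str], float]


def belief(mass: Mass, hypothesis: FrozenSet[str]) -> float:
    """Bel(A): total mass of focal elements contained in A."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass: Mass, hypothesis: FrozenSet[str]) -> float:
    """Pl(A): total mass of focal elements that intersect A."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


def interval(mass: Mass, label: str) -> Tuple[float, float]:
    """[Bel, Pl] for a single label hypothesis."""
    singleton = frozenset({label})
    return belief(mass, singleton), plausibility(mass, singleton)


# Hypothetical pooled evidence: "some kind of wall", with nothing that
# discriminates a side wall from a front wall.
pooled = {frozenset({"side-walls-scene", "front-wall-scene"}): 1.0}

for label in ("side-walls-scene", "front-wall-scene", "sky-scene"):
    print(label, interval(pooled, label))
```

Under these assumptions both wall labels come out as (0.0, 1.0) and `sky-scene` as (0.0, 0.0). Neither wall hypothesis can be preferred until some evidence commits mass to only one of them.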
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Need  information  that  distinguishes  side  walls  from 
front  walls. 


Figure  8. 


ILLUSTRATION OF INFORMATION NEEDED TO DISCERN AMBIGUOUS LABEL
HYPOTHESES.

Information KSs can possibly provide:

(KS6
  Property List: (preconditions: all
                  type: object
                  KS-language-props:
                    (has-walls-as-part
                     has-side-walls-as-part
                     has-front-wall-as-part)
                  cur-certainty-prob: 1
                  cur-preciseness-prob: .7
                  cur-accuracy-prob: 1 )

(KS1
  Property List: (preconditions: all
                  type: object
                  KS-language-props:
                    (has-house-as-part)
                  cur-certainty-prob: 1
                  cur-preciseness-prob: .7
                  cur-accuracy-prob: 1 )


Figure  9. 


KS SPECIFICATIONS.

Control strategies' CKS mass-functions:

TYPE OF INFORMATION REPORTED: strategies
(most-igr-about-prop best-ambiguity-resolving-ks-is)

TYPE OF INFORMATION REPORTED: Pruned action-prop-names

(invoke-KS70&KS30&KS80&KS90-R14 invoke-KS6&KS30&KS80&KS90-R14
 invoke-KS70&KS80&KS30-R14 invoke-KS6&KS70&KS30&KS90-R14)

TYPE OF INFORMATION REPORTED: The mass functions returned by
                              the last invoked CKSs

((CKS2 <=> most-igr-about-prop
  (((invoke-KS6&KS70&KS30&KS90-R14 invoke-KS70&KS80&KS30-R14
     invoke-KS6&KS30&KS80&KS90-R14
     invoke-KS70&KS30&KS80&KS90-R14) .1)
   ((invoke-KS6&KS70&KS30&KS90-R14
     invoke-KS6&KS30&KS80&KS90-R14) .9)))

 (CKS4 <=> best-ambiguity-resolving-ks-is
  (((invoke-KS6&KS30&KS80&KS90-R14
     invoke-KS6&KS70&KS30&KS90-R14) 0.6667)
   ((invoke-KS70&KS30&KS80&KS90-R14 invoke-KS70&KS80&KS30-R14
     invoke-KS6&KS30&KS80&KS90-R14
     invoke-KS6&KS70&KS30&KS90-R14) 0.333))))

Figure 10.

OPINIONS  FROM  IGNORANCE  AND  AMBIGUITY  CKSs. 
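
The two opinions reported in this figure can be pooled with Dempster's rule of combination. The following is a minimal sketch of that rule rather than the system's own implementation. It abbreviates the four pruned actions as `a1` to `a4`, with `a1` and `a2` standing for the two actions that invoke KS6, and it takes the mass values from the figure (0.9 and 0.1 for CKS2, 0.6667 and 0.3333 for CKS4). The function name `dempster_combine` is an assumption made for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass functions over the same frame with Dempster's rule."""
    combined: Mass = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        common = b & c
        if common:
            combined[common] = combined.get(common, 0.0) + mass_b * mass_c
        else:
            # Mass that would fall on the empty set measures the conflict.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence: the rule is undefined")
    # Renormalize so that the remaining masses sum to one.
    return {focal: m / (1.0 - conflict) for focal, m in combined.items()}


THETA_A = frozenset({"a1", "a2", "a3", "a4"})  # frame of candidate actions
A1_OR_A2 = frozenset({"a1", "a2"})             # the actions that invoke KS6

cks2 = {A1_OR_A2: 0.9, THETA_A: 0.1}           # most-igr-about-prop
cks4 = {A1_OR_A2: 0.6667, THETA_A: 0.3333}     # best-ambiguity-resolving-ks

pooled = dempster_combine(cks2, cks4)
```

Because both opinions support the same subset there is no conflict, and the pooled mass on `A1_OR_A2` is 1 - (0.1)(0.3333), about 0.967, with about 0.033 left on the whole frame.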


Combining CKS2’s & CKS4’s mass-functions:

$a_1$ = invoke-KS6&KS70&KS30&KS90-R14
$a_2$ = invoke-KS6&KS30&KS80&KS90-R14
$a_3$ = invoke-KS70&KS30&KS80&KS90-R14
$a_4$ = invoke-KS70&KS80&KS30-R14
$\Theta_A = a_1 \vee a_2 \vee a_3 \vee a_4$

$F_1$: best-ambiguity-resolving-ks
$F_2$: most-igr-about-prop

$M_{F_1}$ = (($a_1 \vee a_2$ .6667) ($\Theta_A$ .3333))
$M_{F_2}$ = (($a_1 \vee a_2$ .9) ($\Theta_A$ .1))
$M_{F_1} \oplus M_{F_2}$ = (($a_1 \vee a_2$ .966) ($\Theta_A$ .034))


Figure 11.


POOLING THE OPINIONS OF THE IGNORANCE AND AMBIGUITY CKSs.

Results of taking action:

TYPE OF INFORMATION REPORTED: action system will take
invoke-KS6&KS30&KS80&KS90-R14 <=> ((KS6 KS30 KS80 KS90) (R14))

ltm inference results

tree-crown-scene        [0.0 , 1.e-3]
sky-scene               [0.0 , 1.e-3]
side-walls-scene        [0.0 , 0.3]
shutters-scene          [0.0 , 0.0]
roof-scene              [0.0 , 1.e-3]
road-scene              [0.0 , 1.e-3]
puffton-house-scene     [0.0 , 3.e-3]
house-scene             [1.e-3 , 4.e-3]
griffith-house-scene    [0.0 , 3.e-3]
grass-scene             [0.0 , 1.e-3]
front-wall-scene        [0.606 , 0.099]
bush-scene              [0.0 , 1.e-3]
brown-house-scene       [0.0 , 3.e-3]
a-road-scene            [0.0 , 1.e-3]

TYPE OF INFORMATION REPORTED:
instantiated hypothesis, instantiated regions

((front-wall-scene) R14)


Figure  12. 


RESULT  OF  USING  AMBIGUITY  CONTROL  STRATEGY. 
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